* Disabling x86 System Management Mode
From: John @ 2007-04-16 10:47 UTC
To: linux-kernel; +Cc: linux.kernel
Hello everyone,
According to Wikipedia:
http://en.wikipedia.org/wiki/Non-Maskable_Interrupt
http://en.wikipedia.org/wiki/System_Management_Mode
"SMM is an operating mode of the Intel 386SL and later microprocessor in
which all normal execution (including the operating system) is
suspended, and special separate software (usually firmware or a
hardware-assisted debugger) is executed in high-privilege mode.
Operations in SMM take CPU time away from the OS, since the CPU state
must be stored to memory (SMRAM) and any write back caches must be
flushed. This can destroy real-time behavior and cause clock ticks to
get lost."
AFAIU, even a hard real-time OS is "defenseless" against SMIs that
kick the CPU into SMM.
I'm planning on writing a few lines of code to gather indirect evidence
that the CPU is periodically entering SMM.
I was considering a kernel module along the lines of...
for (i=0; i < N; ++i)
{
  schedule_timeout(1); /* sleep at least one jiffy */
  disable interrupts
  unsigned cycles = foo();
  /* update stats with cycles */
  enable interrupts
}
foo is a loop full of NOPs.
It takes up only a few lines in L1 cache.
The conditional jump is easy to predict.
There is a serializing instruction before and after.
As a result, foo's latency should be very consistent.
foo returns the number of cycles it ran. On the systems I have to work
with, foo typically runs in ~800 microseconds.
I suppose disabling interrupts that long is bound to break something
somewhere?
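.globl foo
foo:
        push %ebx               # cpuid clobbers %ebx
        push %esi
        cpuid                   # serialize before reading the TSC
        rdtsc                   # start timestamp, low 32 bits in %eax
        mov MM, %ecx            # loop counter
        mov %eax, %esi          # save start timestamp
        .align 16
.L1:
        nop
        nop
        ... (lots of nops)
        dec %ecx
        jnz .L1
.L2:
        cpuid                   # serialize again
        rdtsc                   # end timestamp
        sub %esi, %eax          # elapsed cycles (low 32 bits)
        pop %esi
        pop %ebx
        ret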
If foo returns something out of the ordinary, even though interrupts
were disabled, then it must have been interrupted by a non-maskable
interrupt, probably a system management interrupt.
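A minimal userspace sketch of that decision rule, assuming the cycle counts returned by foo have already been collected into an array. The helper name count_outliers and the threshold parameter are illustrative assumptions, not part of the original post: a sample is flagged when it exceeds the fastest observed run by more than the threshold.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post): count the samples
 * that exceed the fastest observed run by more than `threshold` cycles.
 * With interrupts disabled, such outliers hint at an NMI or SMI. */
static size_t count_outliers(const unsigned *cycles, size_t n,
                             unsigned threshold)
{
    assert(cycles != NULL && n > 0);

    /* The fastest run approximates the undisturbed latency of foo. */
    unsigned baseline = cycles[0];
    for (size_t i = 1; i < n; ++i)
        if (cycles[i] < baseline)
            baseline = cycles[i];

    /* Count every run that took noticeably longer than the baseline. */
    size_t outliers = 0;
    for (size_t i = 0; i < n; ++i)
        if (cycles[i] - baseline > threshold)
            ++outliers;
    return outliers;
}
```

Using the minimum as the baseline (rather than the mean) keeps a few slow runs from shifting the reference point; the threshold has to be chosen from the normal jitter seen on the target machine.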
What do you think about this approach?
I'm open to comments and suggestions.
Regards.
* Re: Disabling x86 System Management Mode
From: John @ 2007-04-16 11:31 UTC
To: linux-kernel; +Cc: linux.kernel

[-- Attachment #1: Type: text/plain, Size: 1140 bytes --]

John wrote:

> On the systems I have to work with [...]

I didn't specify what they were.

CPU: Intel Pentium 3

# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 11
model name      : Intel(R) Pentium(R) III CPU - S 1266MHz
stepping        : 4
cpu MHz         : 1266.700
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips        : 2533.40
clflush size    : 32

Motherboard: Adlink EBC-2000T
http://www.adlinktech.com/PD/web/PD_detail.php?pid=213

Chipset: VIA Pro133T
http://www.via.com.tw/en/products/chipsets/legacy/pro133/
VT82C694T north bridge + VT82C686B south bridge

AFAIU, the south bridge can be a source of SMIs.
Can the north bridge also be a source of SMIs?

What I/O ports do I need to write to, and what values should I write to
these ports, in order to prevent the VT82C686B from sending SMIs?

Regards.

[-- Attachment #2: adlink.lspci --]
[-- Type: text/plain, Size: 9863 bytes --]

00:00.0 Host bridge: VIA Technologies, Inc. VT82C693A/694x [Apollo PRO133x] (rev c4)
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
    Latency: 8
    Region 0: Memory at e0000000 (32-bit, prefetchable) [size=64M]
    Capabilities: [a0] AGP version 2.0
        Status: RQ=32 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 64bit- FW- AGP3- Rate=x1,x2
        Command: RQ=1 ArqSz=0 Cal=0 SBA- AGP- GART64- 64bit- FW- Rate=<none>
    Capabilities: [c0] Power Management version 2
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:01.0 PCI bridge: VIA Technologies, Inc. VT82C598/694x [Apollo MVP3/Pro133x AGP] (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
    Latency: 0
    Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
    I/O behind bridge: 0000f000-00000fff
    Memory behind bridge: fff00000-000fffff
    Prefetchable memory behind bridge: fff00000-000fffff
    BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B-
    Capabilities: [80] Power Management version 2
        Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40)
    Subsystem: VIA Technologies, Inc. VT82C686/A PCI to ISA Bridge
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 0
    Capabilities: [c0] Power Management version 2
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:07.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06) (prog-if 8a [Master SecP PriP])
    Subsystem: VIA Technologies, Inc. VT82C586/B/VT82C686/A/B/VT8233/A/C/VT8235 PIPC Bus Master IDE
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 32
    Region 0: [virtual] I/O ports at 01f0 [size=8]
    Region 1: [virtual] I/O ports at 03f4
    Region 2: [virtual] I/O ports at 0170 [size=8]
    Region 3: [virtual] I/O ports at 0374
    Region 4: I/O ports at e400 [size=16]
    Capabilities: [c0] Power Management version 2
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:07.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 1a) (prog-if 00 [UHCI])
    Subsystem: VIA Technologies, Inc. (Wrong ID) USB Controller
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 32, cache line size 08
    Interrupt: pin D routed to IRQ 5
    Region 4: I/O ports at d000 [size=32]
    Capabilities: [80] Power Management version 2
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:07.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
    Subsystem: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI]
    Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Interrupt: pin ? routed to IRQ 9
    Capabilities: [68] Power Management version 2
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:08.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 08)
    Subsystem: Intel Corporation EtherExpress PRO/100B (TX)
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 32 (2000ns min, 14000ns max), cache line size 08
    Interrupt: pin A routed to IRQ 10
    Region 0: Memory at e6401000 (32-bit, non-prefetchable) [size=4K]
    Region 1: I/O ports at d400 [size=64]
    Region 2: Memory at e6200000 (32-bit, non-prefetchable) [size=1M]
    Expansion ROM at 20000000 [disabled] [size=1M]
    Capabilities: [dc] Power Management version 2
        Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
        Status: D0 PME-Enable- DSel=0 DScale=2 PME-

00:09.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 08)
    Subsystem: Intel Corporation EtherExpress PRO/100B (TX)
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 32 (2000ns min, 14000ns max), cache line size 08
    Interrupt: pin A routed to IRQ 11
    Region 0: Memory at e6400000 (32-bit, non-prefetchable) [size=4K]
    Region 1: I/O ports at d800 [size=64]
    Region 2: Memory at e6000000 (32-bit, non-prefetchable) [size=1M]
    Capabilities: [dc] Power Management version 2
        Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
        Status: D0 PME-Enable- DSel=0 DScale=2 PME-

00:0a.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 08)
    Subsystem: Intel Corporation EtherExpress PRO/100B (TX)
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 32 (2000ns min, 14000ns max), cache line size 08
    Interrupt: pin A routed to IRQ 12
    Region 0: Memory at e6403000 (32-bit, non-prefetchable) [size=4K]
    Region 1: I/O ports at dc00 [size=64]
    Region 2: Memory at e6100000 (32-bit, non-prefetchable) [size=1M]
    Capabilities: [dc] Power Management version 2
        Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
        Status: D0 PME-Enable- DSel=0 DScale=2 PME-

00:0b.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27) (prog-if 00 [VGA])
    Subsystem: ATI Technologies Inc Rage XL
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 32 (2000ns min), cache line size 08
    Interrupt: pin A routed to IRQ 5
    Region 0: Memory at e5000000 (32-bit, non-prefetchable) [size=16M]
    Region 1: I/O ports at e000 [size=256]
    Region 2: Memory at e6402000 (32-bit, non-prefetchable) [size=4K]
    Expansion ROM at 20100000 [disabled] [size=128K]
    Capabilities: [5c] Power Management version 2
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:0d.0 PCI bridge: Intel Corporation 21152 PCI-to-PCI Bridge (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 32, cache line size 08
    Bus: primary=00, secondary=02, subordinate=02, sec-latency=32
    I/O behind bridge: 0000c000-0000cfff
    Memory behind bridge: e6300000-e63fffff
    Prefetchable memory behind bridge: 00000000fff00000-0000000000000000
    BridgeCtl: Parity- SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
    Capabilities: [dc] Power Management version 2
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Bridge: PM- B3+

02:0f.0 Multimedia controller: PLX Technology, Inc. 9056 PCI I/O Accelerator
    Subsystem: PHILIPS Business Electronics B.V.: Unknown device d128
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 32 (4000ns min, 6500ns max), cache line size 08
    Interrupt: pin A routed to IRQ 5
    Region 0: Memory at e6300000 (32-bit, non-prefetchable) [size=512]
    Region 1: I/O ports at c000 [size=256]
    Region 2: Memory at e6301000 (32-bit, non-prefetchable) [size=512]
    Capabilities: [40] Power Management version 1
        Flags: PMEClk- DSI- D1- D2+ AuxCurrent=0mA PME(D0-,D1+,D2-,D3hot+,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=1 PME-
    Capabilities: [48] #06 [0000]
    Capabilities: [4c] Vital Product Data
* Re: Disabling x86 System Management Mode
From: Lee Revell @ 2007-04-16 15:12 UTC
To: John; +Cc: linux-kernel

On 4/16/07, John <linux.kernel@free.fr> wrote:
> Chipset: VIA Pro133T
> http://www.via.com.tw/en/products/chipsets/legacy/pro133/
> VT82C694T north bridge + VT82C686B south bridge
>
> AFAIU, the south bridge can be a source of SMIs.
> Can the north bridge also be a source of SMIs?
>
> What I/O ports do I need to write to, and what values should I write to
> these ports, in order to prevent the VT82C686B from sending SMIs?
>

Check out some of the links in this thread:

http://www.embeddedrelated.com/usenet/embedded/show/50333-1.php

Lee
* Re: Disabling x86 System Management Mode
From: Andi Kleen @ 2007-04-16 22:12 UTC
To: John; +Cc: linux-kernel

John <linux.kernel@free.fr> writes:

Please use a full real name for posting.

> AFAIU, even a hard real-time OS is "defenseless" against SMIs that
> kick the CPU into SMM.

There are usually chipset specific bits that can be set to disable SMMs.
See the datasheet if you can get them. Unfortunately most chipset vendors
don't give out data sheets easily.

> .globl foo
> foo:
> push %ebx
> push %esi
> cpuid
> rdtsc

At least some SMM implementations restore the old TSC value. Sad but true.

Besides RDTSC can be speculated around on some CPUs which also adds errors.

-Andi
* Re: Disabling x86 System Management Mode
From: John Sigler @ 2007-04-17 16:49 UTC
To: Andi Kleen; +Cc: linux-kernel, linux.kernel

[-- Attachment #1: Type: text/plain, Size: 3740 bytes --]

Andi Kleen wrote:

> Please use a full real name for posting.

OK.

> John Sigler wrote:
>
>> AFAIU, even a hard real-time OS is "defenseless" against SMIs that
>> kick the CPU into SMM.
>
> There are usually chipset specific bits that can be set to disable SMMs.
> See the datasheet if you can get them. Unfortunately most chipset vendors
> don't give out data sheets easily.

I've asked the manufacturer to send me the data sheets. We'll see how
they react.

>> .globl foo
>> foo:
>> push %ebx
>> push %esi
>> cpuid
>> rdtsc
>
> At least some SMM implementations restore the old TSC value. Sad but true.

Why would they do that?

How would you detect periodic SMM on such a system?

> Besides RDTSC can be speculated around on some CPUs which also adds errors.

I don't understand this sentence. Could you clarify?
I've attached my kernel module (hello.c)

# : >/var/log/kern.log; cat /proc/interrupts; /bin/time insmod houba.ko; cat /proc/interrupts; rmmod houba
           CPU0
  0:     519083    XT-PIC-XT  timer
  2:          0    XT-PIC-XT  cascade
  9:          0    XT-PIC-XT  acpi
 10:       9786    XT-PIC-XT  eth0
 11:          5    XT-PIC-XT  eth1
 12:          5    XT-PIC-XT  eth2
 14:      16920    XT-PIC-XT  ide0
NMI:          0
ERR:          0
0.00user 0.00system 5:00.00elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+111minor)pagefaults 0swaps
           CPU0
  0:     549094    XT-PIC-XT  timer
  2:          0    XT-PIC-XT  cascade
  9:          0    XT-PIC-XT  acpi
 10:       9791    XT-PIC-XT  eth0
 11:          5    XT-PIC-XT  eth1
 12:          5    XT-PIC-XT  eth2
 14:      16970    XT-PIC-XT  ide0
NMI:          0
ERR:          0

(HZ=100)
30011 timer interrupts
5 eth0 interrupts
50 ide0 interrupts

# cat /var/log/kern.log
Apr 17 18:22:27 SEND kernel: INIT
Apr 17 18:27:27 SEND kernel: 2350080 29995
Apr 17 18:27:27 SEND kernel: 2369792 1
Apr 17 18:27:27 SEND kernel: 2440192 1
Apr 17 18:27:27 SEND kernel: 2441216 1
Apr 17 18:27:27 SEND kernel: 2583296 1
Apr 17 18:27:27 SEND kernel: 2852096 1
Apr 17 18:27:27 SEND kernel: EXIT

First column is the cycle count rounded down to a multiple of 256.
(1266.7 MHz CPU)
Second column is occurrence count.

In the second experiment, I added the IRQ disable/enable around foo.

# : >/var/log/kern.log; cat /proc/interrupts; /bin/time insmod houba.ko; cat /proc/interrupts; rmmod houba
           CPU0
  0:     583666    XT-PIC-XT  timer
  2:          0    XT-PIC-XT  cascade
  9:          0    XT-PIC-XT  acpi
 10:      10084    XT-PIC-XT  eth0
 11:          5    XT-PIC-XT  eth1
 12:          5    XT-PIC-XT  eth2
 14:      17012    XT-PIC-XT  ide0
NMI:          0
ERR:          0
0.00user 0.01system 5:00.00elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+110minor)pagefaults 0swaps
           CPU0
  0:     613677    XT-PIC-XT  timer
  2:          0    XT-PIC-XT  cascade
  9:          0    XT-PIC-XT  acpi
 10:      10089    XT-PIC-XT  eth0
 11:          5    XT-PIC-XT  eth1
 12:          5    XT-PIC-XT  eth2
 14:      17070    XT-PIC-XT  ide0
NMI:          0
ERR:          0

30011 timer interrupts
5 eth0 interrupts
58 ide0 interrupts

# cat /var/log/kern.log
Apr 17 18:33:12 SEND kernel: INIT
Apr 17 18:38:12 SEND kernel: 2350080 30000
Apr 17 18:38:12 SEND kernel: EXIT

In this experiment, all the calls to foo had a latency between
2350080 and 2350335 cycles (1.85528 to 1.85548 ms).

I don't think I can conclude anything based on this experiment.

I'll let it run all night long.

Regards.

[-- Attachment #2: hello.c --]
[-- Type: text/plain, Size: 938 bytes --]

#include <linux/init.h>
#include <linux/module.h>
#include <linux/sched.h>
#include <linux/slab.h>

MODULE_LICENSE("GPL");

#define N 30000
#define MAX 20000

extern unsigned foo(void);

static int hello_init(void)
{
  int i;
  int *lat;
  printk(KERN_ALERT "INIT\n");
  lat = kmalloc(MAX * sizeof *lat, GFP_KERNEL);
  if (lat == NULL) return -1;
  for (i=0; i < MAX; ++i) lat[i] = 0;
  for (i=0; i < 5; ++i) foo();
  for (i=0; i < N; ++i)
  {
    unsigned count, res;
    set_current_state(TASK_INTERRUPTIBLE);
    schedule_timeout(1);
    foo();
    local_irq_disable();
    count = foo();
    local_irq_enable();
    res = count >> 8;
    if (res < MAX) ++lat[res]; else printk(KERN_ALERT "OUT OF RANGE\n");
  }

  for (i=0; i < MAX; ++i)
    if (lat[i] != 0) printk(KERN_ALERT "%d %d\n", i<<8, lat[i]);
  return 0;
}

static void hello_exit(void)
{
  printk(KERN_ALERT "EXIT\n");
}

module_init(hello_init);
module_exit(hello_exit);
* Re: Disabling x86 System Management Mode
From: Andi Kleen @ 2007-04-17 16:57 UTC
To: John Sigler; +Cc: Andi Kleen, linux-kernel

On Tue, Apr 17, 2007 at 06:49:09PM +0200, John Sigler wrote:
> >>.globl foo
> >>foo:
> >> push %ebx
> >> push %esi
> >> cpuid
> >> rdtsc
> >
> >At least some SMM implementations restore the old TSC value. Sad but true.
>
> Why would they do that?

I asked the same question. But it has been observed.

> How would you detect periodic SMM on such a system?

It's not a design goal of SMM to be detectable so the BIOS writers and
hardware designers don't care if you can.

You could probably try to measure using an external or the LAPIC clock.
Or check the chipset bits.

>
> >Besides RDTSC can be speculated around on some CPUs which also adds errors.
>
> I don't understand this sentence. Could you clarify?

Modern x86 CPUs execute code out of order and in parallel. The reordering
window can be quite large and the CPU can execute code speculatively.
This can add large errors to RDTSC when the instruction is not executed
where you think it is. One way around this is to synchronize it -- using
CPUID -- but that also adds latency and makes the measurement less precise.

-Andi
* Re: Disabling x86 System Management Mode
From: John Sigler @ 2007-04-17 21:32 UTC
To: Andi Kleen; +Cc: linux-kernel, linux.kernel

Andi Kleen wrote:

> Modern x86 CPUs execute code out of order and in parallel.

I am aware of the (apparent) non-deterministic nature of superscalar
out-of-order speculative execution.

> The reordering window can be quite large and the CPU can execute code
> speculatively. This can add large errors to RDTSC when the
> instruction is not executed where you think it is. One way around
> this is to synchronize it -- using CPUID --

Your terminology is surprising. CPUID is commonly referred to as a
/serializing/ instruction.

> but that also adds latency and makes the measurement less precise.

Why would adding known latency make the measurement less precise?

What I needed was a block of code with consistent latency, whatever the
actual number.

Regards.
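
A small sketch of the point about consistent latency, under the assumption that the serialization overhead is the same on every run (if CPUID's own latency varies, that variation adds to the spread). The helper name spread is introduced here for illustration: a constant offset added to every sample leaves max - min unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: spread (max - min) of a set of cycle counts.
 * A fixed overhead added to every sample cancels out in the subtraction. */
static unsigned spread(const unsigned *cycles, size_t n)
{
    assert(cycles != NULL && n > 0);

    unsigned lo = cycles[0];
    unsigned hi = cycles[0];
    for (size_t i = 1; i < n; ++i) {
        if (cycles[i] < lo)
            lo = cycles[i];
        if (cycles[i] > hi)
            hi = cycles[i];
    }
    return hi - lo;
}
```

Comparing the spread of a quiet run against the spread of a suspect run is one way to judge consistency without knowing the absolute latency.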
* Re: Disabling x86 System Management Mode
From: John Sigler @ 2007-04-17 17:01 UTC
Cc: linux-kernel

John Sigler wrote:

> static int hello_init(void)
> {
>   int i;
>   int *lat;
>   printk(KERN_ALERT "INIT\n");
>   lat = kmalloc(MAX * sizeof *lat, GFP_KERNEL);
>   if (lat == NULL) return -1;
>   for (i=0; i < MAX; ++i) lat[i] = 0;
>   for (i=0; i < 5; ++i) foo();
>   for (i=0; i < N; ++i)
>   {
>     unsigned count, res;
>     set_current_state(TASK_INTERRUPTIBLE);
>     schedule_timeout(1);
>     foo();
>     local_irq_disable();
>     count = foo();
>     local_irq_enable();
>     res = count >> 8;
>     if (res < MAX) ++lat[res]; else printk(KERN_ALERT "OUT OF RANGE\n");
>   }
>
>   for (i=0; i < MAX; ++i)
>     if (lat[i] != 0) printk(KERN_ALERT "%d %d\n", i<<8, lat[i]);

    kfree(lat);

>   return 0;
> }
* Re: Disabling x86 System Management Mode
From: John Sigler @ 2007-04-18 8:09 UTC
To: linux-kernel; +Cc: Andi Kleen, linux.kernel

John Sigler wrote:

> # : >/var/log/kern.log; cat /proc/interrupts; /bin/time insmod houba.ko;
> cat /proc/interrupts; rmmod houba
>            CPU0
>   0:     519083    XT-PIC-XT  timer
>   2:          0    XT-PIC-XT  cascade
>   9:          0    XT-PIC-XT  acpi
>  10:       9786    XT-PIC-XT  eth0
>  11:          5    XT-PIC-XT  eth1
>  12:          5    XT-PIC-XT  eth2
>  14:      16920    XT-PIC-XT  ide0
> NMI:          0
> ERR:          0
> 0.00user 0.00system 5:00.00elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (0major+111minor)pagefaults 0swaps
>            CPU0
>   0:     549094    XT-PIC-XT  timer
>   2:          0    XT-PIC-XT  cascade
>   9:          0    XT-PIC-XT  acpi
>  10:       9791    XT-PIC-XT  eth0
>  11:          5    XT-PIC-XT  eth1
>  12:          5    XT-PIC-XT  eth2
>  14:      16970    XT-PIC-XT  ide0
> NMI:          0
> ERR:          0
>
> (HZ=100)
> 30011 timer interrupts
> 5 eth0 interrupts
> 50 ide0 interrupts
>
> # cat /var/log/kern.log
> Apr 17 18:22:27 SEND kernel: INIT
> Apr 17 18:27:27 SEND kernel: 2350080 29995
> Apr 17 18:27:27 SEND kernel: 2369792 1
> Apr 17 18:27:27 SEND kernel: 2440192 1
> Apr 17 18:27:27 SEND kernel: 2441216 1
> Apr 17 18:27:27 SEND kernel: 2583296 1
> Apr 17 18:27:27 SEND kernel: 2852096 1
> Apr 17 18:27:27 SEND kernel: EXIT
>
> First column is the cycle count rounded down to a multiple of 256.
> (1266.7 MHz CPU)
> Second column is occurrence count.
>
> In the second experiment, I added the IRQ disable/enable around foo.
>
> # : >/var/log/kern.log; cat /proc/interrupts; /bin/time insmod houba.ko;
> cat /proc/interrupts; rmmod houba
>            CPU0
>   0:     583666    XT-PIC-XT  timer
>   2:          0    XT-PIC-XT  cascade
>   9:          0    XT-PIC-XT  acpi
>  10:      10084    XT-PIC-XT  eth0
>  11:          5    XT-PIC-XT  eth1
>  12:          5    XT-PIC-XT  eth2
>  14:      17012    XT-PIC-XT  ide0
> NMI:          0
> ERR:          0
> 0.00user 0.01system 5:00.00elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (0major+110minor)pagefaults 0swaps
>            CPU0
>   0:     613677    XT-PIC-XT  timer
>   2:          0    XT-PIC-XT  cascade
>   9:          0    XT-PIC-XT  acpi
>  10:      10089    XT-PIC-XT  eth0
>  11:          5    XT-PIC-XT  eth1
>  12:          5    XT-PIC-XT  eth2
>  14:      17070    XT-PIC-XT  ide0
> NMI:          0
> ERR:          0
>
> 30011 timer interrupts
> 5 eth0 interrupts
> 58 ide0 interrupts
>
> # cat /var/log/kern.log
> Apr 17 18:33:12 SEND kernel: INIT
> Apr 17 18:38:12 SEND kernel: 2350080 30000
> Apr 17 18:38:12 SEND kernel: EXIT
>
> In this experiment, all the calls to foo had a latency between
> 2350080 and 2350335 cycles (1.85528 to 1.85548 ms).
>
> I don't think I can conclude anything based on this experiment.
>
> I'll let it run all night long.

With the IRQ disable/enable around foo:

# : >/var/log/kern.log; cat /proc/interrupts; /bin/time insmod houba.ko; cat /proc/interrupts; rmmod houba
           CPU0
  0:      14659    XT-PIC-XT  timer
  2:          0    XT-PIC-XT  cascade
  9:          0    XT-PIC-XT  acpi
 10:        144    XT-PIC-XT  eth0
 11:          5    XT-PIC-XT  eth1
 12:          5    XT-PIC-XT  eth2
 14:      21526    XT-PIC-XT  ide0
NMI:          0
ERR:          0
0.00user 0.01system 12:30:00elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+110minor)pagefaults 0swaps
           CPU0
  0:    4516164    XT-PIC-XT  timer
  2:          0    XT-PIC-XT  cascade
  9:          0    XT-PIC-XT  acpi
 10:        414    XT-PIC-XT  eth0
 11:          5    XT-PIC-XT  eth1
 12:          5    XT-PIC-XT  eth2
 14:      21620    XT-PIC-XT  ide0
NMI:          0
ERR:          0

4501505 timer interrupts (I had set N to 4500000)
270 eth0 interrupts
94 ide0 interrupts

# cat /var/log/kern.log
Apr 17 19:03:26 SEND kernel: INIT
Apr 18 07:33:26 SEND kernel: 2350080 4500000
Apr 18 07:33:26 SEND kernel: EXIT

I'd say the CPU does not appear to enter SMM on this system.
(Unless the SMI handler restores the TSC as suggested by Andi.)

I need to refine my detection code.
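
The elapsed times reported by /bin/time are consistent with the loop structure: each iteration sleeps at least one jiffy, so N iterations take at least N/HZ seconds. A sketch of that arithmetic follows; struct hms and min_runtime are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Hypothetical type: a duration split into hours, minutes and seconds. */
struct hms {
    unsigned long hours;
    unsigned minutes;
    unsigned seconds;
};

/* Hypothetical helper: lower bound on the run time of a loop that sleeps
 * at least one jiffy per iteration, given the kernel tick rate `hz`. */
static struct hms min_runtime(unsigned long iterations, unsigned hz)
{
    assert(hz > 0);

    unsigned long total = iterations / hz;   /* whole seconds */
    struct hms t;
    t.hours = total / 3600;
    t.minutes = (unsigned)((total % 3600) / 60);
    t.seconds = (unsigned)(total % 60);
    return t;
}
```

With HZ=100, N=30000 gives 0:05:00 and N=4500000 gives 12:30:00, which matches the elapsed times in the logs.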
* Re: Disabling x86 System Management Mode
From: John Sigler @ 2007-04-18 11:41 UTC
To: linux-kernel; +Cc: linux.kernel

Andi Kleen wrote:

> There are usually chipset specific bits that can be set to disable
> SMMs. See the datasheet if you can get them. Unfortunately most
> chipset vendors don't give out data sheets easily.

I managed to find the south bridge data sheet.

http://linux.kernel.free.fr/VT82C686B.pdf

(I'm still looking for the north bridge data sheet.)

Could someone point me in the right direction?

Regards.
* Re: Disabling x86 System Management Mode
From: Andrew Shewmaker @ 2007-04-18 14:06 UTC
To: John Sigler; +Cc: linux-kernel

On 4/18/07, John Sigler <linux.kernel@free.fr> wrote:
> Andi Kleen wrote:
>
> > There are usually chipset specific bits that can be set to disable
> > SMMs. See the datasheet if you can get them. Unfortunately most
> > chipset vendors don't give out data sheets easily.
>
> I managed to find the south bridge data sheet.
>
> http://linux.kernel.free.fr/VT82C686B.pdf
>
> (I'm still looking for the north bridge data sheet.)
>
> Could someone point me in the right direction?

The LinuxBIOS folks had a thread on SMM recently. You don't have to
worry about SMM if you can use their stuff.

http://www.linuxbios.org/pipermail/linuxbios/2006-December/017861.html

same thread, but continued

http://www.linuxbios.org/pipermail/linuxbios/2007-January/017905.html

--
Andrew Shewmaker
* Re: Disabling x86 System Management Mode
From: John Sigler @ 2007-04-18 14:39 UTC
To: linux-kernel; +Cc: Andrew Shewmaker

Andrew Shewmaker wrote:

> John Sigler wrote:
>
>> Andi Kleen wrote:
>>
>>> There are usually chipset specific bits that can be set to disable
>>> SMMs. See the datasheet if you can get them. Unfortunately most
>>> chipset vendors don't give out data sheets easily.
>>
>> I managed to find the south bridge data sheet.
>>
>> http://linux.kernel.free.fr/VT82C686B.pdf
>>
>> (I'm still looking for the north bridge data sheet.)
>>
>> Could someone point me in the right direction?
>
> The LinuxBIOS folks had a thread on SMM recently. You don't have to
> worry about SMM if you can use their stuff.

Unfortunately, LinuxBIOS is not a viable option at this point.
I have to live with this proprietary BIOS.

Phoenix AwardBIOS v6.00PG
EBC-2000 REV: A1.3
02/12/2004-694X-686-6A6LJXA9C-00

Regards.