* [BUG] 2.5.69 oops at sysenter_past_esp
From: mikpe @ 2003-05-06 19:52 UTC
To: linux-kernel
Old Dell Latitude with very basic .config: PII, IDE/PIIX, ext2, cardbus,
hotplug, networking, but no SMP, {IO-,}APIC, ACPI, usb.
Booting into a text console, not starting X or inserting cardbus NIC,
suspending the box (apm). At resume, I am immediately greeted with an
oops looking like:
general protection fault: 0000 [#?]
CPU: 0
EIP: 0060:[<c0109079>] Not tainted
EFLAGS: 00010246
EIP is at sysenter_past_esp+0x6e/0x71
<register dump>
Process <varies, any one of the daemons>
Stack: ...
Call Trace: <empty>
The machine is almost but not completely dead at this point.
The oops repeats several times with varying intervals (from
seconds up to minutes). The keyboard is initially not dead
(it responds to RET) but it too locks up after a while.
I don't know if this is new in 2.5.69, as I didn't test suspend
with 2.5.68 -- I've had resume-related PS/2 mouse problems with
recent 2.5 kernels, fixed finally by the "psmouse_noext" option.
/Mikael

* Re: [BUG] 2.5.69 oops at sysenter_past_esp
From: Dave Jones @ 2003-05-06 22:35 UTC
To: mikpe; +Cc: linux-kernel

On Tue, May 06, 2003 at 09:52:24PM +0200, mikpe@csd.uu.se wrote:
> suspending the box (apm). At resume, I am immediately greeted with an
> oops looking like:
>
> general protection fault: 0000 [#?]
> CPU: 0
> EIP: 0060:[<c0109079>] Not tainted
> EFLAGS: 00010246
> EIP is at sysenter_past_esp+0x6e/0x71

I wonder if your BIOS is trashing the sysenter MSRs on suspend.
Maybe they need restoring?

Dave

* [PATCH] restore sysenter MSRs at resume
From: mikpe @ 2003-05-07 9:33 UTC
To: Dave Jones; +Cc: torvalds, linux-kernel

Dave Jones writes:
> On Tue, May 06, 2003 at 09:52:24PM +0200, mikpe@csd.uu.se wrote:
> > suspending the box (apm). At resume, I am immediately greeted with an
> > oops looking like:
> >
> > general protection fault: 0000 [#?]
> > CPU: 0
> > EIP: 0060:[<c0109079>] Not tainted
> > EFLAGS: 00010246
> > EIP is at sysenter_past_esp+0x6e/0x71
>
> I wonder if your BIOS is trashing the sysenter MSRs on suspend.
> Maybe they need restoring ?

I've confirmed that that's exactly what's happening. EIP points to
the sysexit instruction in entry.S, and the sysenter MSRs are all zero.

The patch below hooks sysenter into the driver model and implements
a resume() method which restores the sysenter MSRs. On my '98 vintage
Latitude, this is necessary since those MSRs are cleared at resume.
Failure to restore them leads to oopses and eventual kernel hang.
(Of course, your user-space must also use sysenter. RH9 does.)

The patch has a debug printk() for problematic systems that require
the fix. If it says your machine didn't preserve the MSRs, please
post a note about this to LKML with your machine model, so we can
estimate the scope of the problem.

/Mikael

diff -ruN linux-2.5.69/arch/i386/kernel/sysenter.c linux-2.5.69.sysenter-pm/arch/i386/kernel/sysenter.c
--- linux-2.5.69/arch/i386/kernel/sysenter.c	2003-05-05 22:56:28.000000000 +0200
+++ linux-2.5.69.sysenter-pm/arch/i386/kernel/sysenter.c	2003-05-07 10:50:39.690468848 +0200
@@ -51,6 +51,53 @@
 	put_cpu();
 }
 
+#ifdef CONFIG_PM
+#include <linux/device.h>
+
+static int sysenter_resume(struct device *dev, u32 state, u32 level)
+{
+	if (level != RESUME_POWER_ON)
+		return 0;
+	/* for collecting statistics, will go away */
+	{
+		unsigned int h, l0, l1, l2;
+		rdmsr(MSR_IA32_SYSENTER_CS, l0, h);
+		rdmsr(MSR_IA32_SYSENTER_ESP, l1, h);
+		rdmsr(MSR_IA32_SYSENTER_EIP, l2, h);
+		if (!l0 || !l1 || !l2)
+			printk("sysenter_resume: your BIOS didn't preserve the SYSENTER MSRs\n");
+		else
+			printk("sysenter_resume: congratulations, your BIOS seems Ok\n");
+	}
+	enable_sep_cpu(NULL);
+	return 0;
+}
+
+static struct device_driver sysenter_driver = {
+	.name = "sysenter",
+	.bus = &system_bus_type,
+	.resume = sysenter_resume,
+};
+
+static struct sys_device device_sysenter = {
+	.name = "sysenter",
+	.id = 0,
+	.dev = {
+		.name = "sysenter",
+		.driver = &sysenter_driver,
+	},
+};
+
+static int __init init_sysenter_devicefs(void)
+{
+	driver_register(&sysenter_driver);
+	return sys_device_register(&device_sysenter);
+}
+
+#else /* CONFIG_PM */
+static inline int init_sysenter_devicefs(void) { return 0; }
+#endif /* CONFIG_PM */
+
 /*
  * These symbols are defined by vsyscall.o to mark the bounds
  * of the ELF DSO images included therein.
@@ -76,7 +123,7 @@
 	       &vsyscall_sysenter_end - &vsyscall_sysenter_start);
 
 	on_each_cpu(enable_sep_cpu, NULL, 1, 1);
-	return 0;
+	return init_sysenter_devicefs();
 }
 
 __initcall(sysenter_setup);
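
For readers wondering why a zeroed MSR turns into a general protection fault exactly at sysexit: on IA-32, SYSEXIT raises #GP(0) when bits 15:2 of IA32_SYSENTER_CS are zero, and otherwise loads CS from that value plus 16 and SS from that value plus 24, both with RPL 3. The following is only a user-space model of that one rule, not kernel code; `model_sysexit` and `struct sysexit_result` are names made up for this sketch, and the value 0x60 is an assumption based on the kernel code selector shown in the oops ("EIP: 0060:").

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural MSR numbers for the SYSENTER/SYSEXIT pair. */
#define MSR_IA32_SYSENTER_CS   0x174
#define MSR_IA32_SYSENTER_ESP  0x175
#define MSR_IA32_SYSENTER_EIP  0x176

/* Outcome of a modelled SYSEXIT (hypothetical type for this sketch). */
struct sysexit_result {
	bool     gp_fault;  /* true if the CPU would raise #GP(0) */
	uint16_t user_cs;   /* selector loaded into CS on success */
	uint16_t user_ss;   /* selector loaded into SS on success */
};

/*
 * Model the selector handling of SYSEXIT for a given content of
 * IA32_SYSENTER_CS. Only the selector rule is modelled; EIP/ESP
 * (taken from EDX/ECX by the real instruction) are ignored.
 */
struct sysexit_result model_sysexit(uint32_t sysenter_cs)
{
	struct sysexit_result r = { false, 0, 0 };

	/* A null selector (bits 15:2 all zero) faults immediately. */
	if ((sysenter_cs & 0xFFFCu) == 0) {
		r.gp_fault = true;
		return r;
	}

	/* User CS/SS sit at fixed offsets from the kernel CS, RPL forced to 3. */
	r.user_cs = (uint16_t)((sysenter_cs + 16) | 3);
	r.user_ss = (uint16_t)((sysenter_cs + 24) | 3);
	return r;
}
```

With the MSR cleared by the BIOS, `model_sysexit(0)` reports a fault, which matches the "general protection fault: 0000" in the report. With the assumed value 0x60 it yields user selectors 0x73 and 0x7b.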

* Re: [PATCH] restore sysenter MSRs at resume
From: Linus Torvalds @ 2003-05-07 14:41 UTC
To: mikpe; +Cc: Dave Jones, linux-kernel

On Wed, 7 May 2003 mikpe@csd.uu.se wrote:
>
> The patch below hooks sysenter into the driver model and implements
> a resume() method which restores the sysenter MSRs.

This is wrong.

For one thing, you screw up SMP seriously, by not enabling sysenter on all
CPU's, only the boot one.

For another, we shouldn't have "device drivers" for the CPU. I certainly
agree about restoring the sysenter MSR's, but they should be restored by
the CPU-specific code long _before_ we start initializing devices.

So I think we should just make it part of the CPU initialization (which
should be in two parts: the low-level asm part for the "core" CPU
registers, and then the high-level C part for things like the MSR's,
user-space segment stuff etc).

So why not just add an explicit call to "cpu_resume()" in one of the
"do_magic_resume()" things, instead of playing games with device trees..

> The patch has a debug printk() for problematic systems that require
> the fix. If it says your machine didn't preserve the MSRs, please
> post a note about this to LKML with your machine model, so we can
> estimate the scope of the problem.

I really think that it should be done unconditionally - there's no point
in even _expecting_ the BIOS to restore various random MSR's. I can't
imagine that many do.

Linus
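
The ordering argued for above (restore CPU state unconditionally, on every CPU, before any device resume code runs) can be sketched as a small user-space model. Everything here is an assumption for illustration: the `sim_msr` array stands in for real rdmsr()/wrmsr(), `NR_CPUS` is fixed at 2, the MSR values are placeholders, and `model_cpu_resume()` only borrows the name of the proposed cpu_resume() hook; no such function is shown anywhere in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_CPUS 2  /* hypothetical CPU count for the model */

enum { SEP_CS, SEP_ESP, SEP_EIP, NR_SEP_MSRS };

/* Simulated per-CPU MSR storage; stands in for rdmsr()/wrmsr(). */
static uint32_t sim_msr[NR_CPUS][NR_SEP_MSRS];

/* Placeholder values, not taken from a real kernel image. */
static const uint32_t sep_value[NR_SEP_MSRS] = {
	0x60, 0xc0400000u, 0xc0109000u
};

/* Model of enable_sep_cpu(): program all three MSRs on one CPU. */
void model_enable_sep_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	memcpy(sim_msr[cpu], sep_value, sizeof(sep_value));
}

/* Model of a BIOS that clears the MSRs across suspend. */
void model_bios_clobber(void)
{
	memset(sim_msr, 0, sizeof(sim_msr));
}

/* Hypothetical cpu_resume(): restore without first testing for loss. */
void model_cpu_resume(int cpu)
{
	model_enable_sep_cpu(cpu);
}

/* Stand-in for device resume; returns 0 only if all CPU state is valid. */
int model_device_resume(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		for (int i = 0; i < NR_SEP_MSRS; i++)
			if (sim_msr[cpu][i] == 0)
				return -1;
	return 0;
}

/* Resume path: CPU state first, on every CPU, then devices. */
int model_resume(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		model_cpu_resume(cpu);
	return model_device_resume();
}
```

In this model the device stage can rely on valid CPU state because the restore is not itself a device callback, which is the structural difference from the driver-model patch.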

* Re: [PATCH] restore sysenter MSRs at resume
From: mikpe @ 2003-05-07 17:23 UTC
To: Linus Torvalds; +Cc: Dave Jones, linux-kernel

Linus Torvalds writes:
>
> On Wed, 7 May 2003 mikpe@csd.uu.se wrote:
> >
> > The patch below hooks sysenter into the driver model and implements
> > a resume() method which restores the sysenter MSRs.
>
> This is wrong.
>
> For one thing, you screw up SMP seriously, by not enabling sysenter on all
> CPU's, only the boot one.

We don't do apm suspend/resume on SMP, so this is no different from the
current situation. I don't know if acpi does it or not.

> For another, we shouldn't have "device drivers" for the CPU. I certainly
> agree about restoring the sysenter MSR's, but they should be restored by
> the CPU-specific code long _before_ we start initializing devices.
>
> So I think we should just make it part of the CPU initialization (which
> should be in two parts: the low-level asm part for the "core" CPU
> registers, and then the high-level C part for things like the MSR's,
> user-space segment stuff etc).
>
> So why not just add an explicit call to "cpu_resume()" in one of the
> "do_magic_resume()" things, instead of playing games with device trees..

Where would cpu_resume() [and cpu_suspend()] live?
arch/i386/kernel/suspend* belong to SOFTWARE_SUSPEND, but I don't
think that approach is desirable when apm mostly works for UP.

I could probably get away with simply having apm.c invoke the C code
in suspend.c, which does restore the SYSENTER MSRs. suspend.c itself
doesn't seem to depend on the SOFTWARE_SUSPEND machinery, but
suspend_asm.S does.

Does that sound reasonable?

> > The patch has a debug printk() for problematic systems that require
> > the fix. If it says your machine didn't preserve the MSRs, please
> > post a note about this to LKML with your machine model, so we can
> > estimate the scope of the problem.
>
> I really think that it should be done unconditionally - there's no point
> in even _expecting_ the BIOS to restore various random MSR's. I can't
> imagine that many do.

It does the restore unconditionally, the check is just informational.

/Mikael

* Re: [PATCH] restore sysenter MSRs at resume
From: Linus Torvalds @ 2003-05-07 17:39 UTC
To: mikpe; +Cc: Dave Jones, linux-kernel

On Wed, 7 May 2003 mikpe@csd.uu.se wrote:
>
> We don't do apm suspend/resume on SMP, so this is no different from the
> current situation. I don't know if acpi does it or not.

Well, the thing is, if we ever do want to support it (and I suspect we
do), we should have the infrastructure ready. It shouldn't be too hard to
support SMP suspend in a 2.7.x timeframe, since it from a technology angle
looks like simply hot-plug CPU's. Some of the infrastructure for that
already exists.

But I seriously doubt we want to do CPU hot-plug as a device driver.
Having a hook in place for it in the arch directory will make it easyish
to add once we integrate all the other hotplug code (which is very
unlikely in the 2.6.x timeframe).

> I could probably get away with simply having apm.c invoke the C code
> in suspend.c, which does restore the SYSENTER MSRs. suspend.c itself
> doesn't seem to depend on the SOFTWARE_SUSPEND machinery, but
> suspend_asm.S does.
>
> Does that sound reasonable?

Sounds reasonable to me. In fact, it looks like it really already exists
as the current "restore_processor_state()" thing. In fact, that one
already _does_ call "enable_sep_cpu()", so what's up?

Linus

* Re: [PATCH] restore sysenter MSRs at resume
From: Pavel Machek @ 2003-05-08 21:47 UTC
To: Linus Torvalds; +Cc: mikpe, Dave Jones, linux-kernel

Hi!

> > We don't do apm suspend/resume on SMP, so this is no different from the
> > current situation. I don't know if acpi does it or not.
>
> Well, the thing is, if we ever do want to support it (and I suspect we
> do), we should have the infrastructure ready. It shouldn't be too hard to
> support SMP suspend in a 2.7.x timeframe, since it from a technology angle
> looks like simply hot-plug CPU's. Some of the infrastructure for that
> already exists.

Actually, then MSRs should be restored during hotadd operation, so
resume still does *not* care about non-boot cpus...

Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]
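
The hot-add variant discussed in the last two messages can be modelled the same way: if SMP suspend is treated as hot-removing every non-boot CPU, then the resume path only has to restore the boot CPU, and each secondary CPU gets its MSRs back through the ordinary bring-up path when it is added again. This is a user-space sketch under those assumptions only; `struct model_cpu`, all `model_*` functions, `NR_CPUS` and the 0x60 placeholder are invented for illustration and do not correspond to kernel interfaces named in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4  /* hypothetical CPU count for the model */

struct model_cpu {
	bool     online;
	uint32_t sysenter_cs;  /* 0 models a lost or unprogrammed MSR */
};

static struct model_cpu cpus[NR_CPUS];

/* Per-CPU bring-up shared by boot, hot-add and resume of the boot CPU. */
void model_cpu_init(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpus[cpu].sysenter_cs = 0x60;  /* placeholder kernel CS */
	cpus[cpu].online = true;
}

/* Hot-add is just the normal bring-up path, so the MSRs come for free. */
void model_cpu_hot_add(int cpu)
{
	model_cpu_init(cpu);
}

/* Suspend: take every non-boot CPU offline; firmware clobbers CPU 0. */
void model_suspend(void)
{
	for (int cpu = 1; cpu < NR_CPUS; cpu++) {
		cpus[cpu].online = false;
		cpus[cpu].sysenter_cs = 0;
	}
	cpus[0].sysenter_cs = 0;
}

/* Resume proper only deals with the boot CPU. */
void model_resume_boot_cpu(void)
{
	model_cpu_init(0);
}

/* Number of CPUs currently online. */
int model_online_count(void)
{
	int n = 0;
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		n += cpus[cpu].online ? 1 : 0;
	return n;
}

/* True if every online CPU has a usable SYSENTER_CS. */
bool model_online_cpus_ok(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpus[cpu].online && cpus[cpu].sysenter_cs == 0)
			return false;
	return true;
}
```

The design point is that no CPU is ever online with stale MSRs once the boot CPU has been restored, because a CPU can only come online through `model_cpu_init()`.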