* tip/master broken with x2apic and kexec
From: Yinghai Lu @ 2010-07-13 2:59 UTC
To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Suresh Siddha
Cc: linux-kernel@vger.kernel.org

tip/master:
system1: BIOS enabled x2apic. The first kernel boots fine, but kexec into the second kernel causes an instant system reboot.

system2: BIOS does not enable x2apic. The first kernel boots fine and enables x2apic, and kexec into the second kernel works, but kexec into the third kernel causes an instant system reboot.

Linus' tree is OK.

On system2, the kexec loop test passes when booting with nox2apic, intr-remapping off, iommu off.

The problem appears to have started in the last two or three weeks.

Any idea?

Bisecting will take a while, because the system POST takes a while every time.

Thanks

Yinghai Lu
* Re: tip/master broken with x2apic and kexec
From: Yinghai Lu @ 2010-07-13 3:29 UTC
To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Suresh Siddha
Cc: linux-kernel@vger.kernel.org

On 07/12/2010 07:59 PM, Yinghai Lu wrote:
> tip/master:
> system1: BIOS enabled x2apic, first kernel boot well, and when kexec second kernel will cause system instant reboot.
>
> system2: BIOS not enable x2apic, first kernel boot well and enable x2apic, and kexec second kernel well. but when kexec third kernel will case system instant reboot.
>
> linus' tree is ok.
>
> but for system2 if boot with nox2apic ,intr-remaping off, iommu off, the kexec loop test will pass.
>
> the problem looks start in recent two or three weeks.
>

offending patch is

commit 83a7a2ad2a9173dcabc05df0f01d1d85b7ba1c2c
Author: H. Peter Anvin <hpa@linux.intel.com>
Date:   Thu Jun 10 00:10:43 2010 +0000

    x86, alternatives: Use 16-bit numbers for cpufeature index

    We already have cpufeature indicies above 255, so use a 16-bit number
    for the alternatives index.  This consumes a padding field and so
    doesn't add any size, but it means that abusing the padding field to
    create assembly errors on overflow no longer works.  We can retain the
    test simply by redirecting it to the .discard section, however.

    [ v3: updated to include open-coded locations ]

    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    LKML-Reference: <tip-f88731e3068f9d1392ba71cc9f50f035d26a0d4f@git.kernel.org>
    Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* Re: tip/master broken with x2apic and kexec
From: H. Peter Anvin @ 2010-07-13 6:40 UTC
To: Yinghai Lu
Cc: Ingo Molnar, Thomas Gleixner, Suresh Siddha, linux-kernel@vger.kernel.org

On 07/12/2010 08:29 PM, Yinghai Lu wrote:
>
> offending patch is
>
> commit 83a7a2ad2a9173dcabc05df0f01d1d85b7ba1c2c
> Author: H. Peter Anvin<hpa@linux.intel.com>
> Date: Thu Jun 10 00:10:43 2010 +0000
>
> x86, alternatives: Use 16-bit numbers for cpufeature index
>
> We already have cpufeature indicies above 255, so use a 16-bit number
> for the alternatives index. This consumes a padding field and so
> doesn't add any size, but it means that abusing the padding field to
> create assembly errors on overflow no longer works. We can retain the
> test simply by redirecting it to the .discard section, however.
>
> [ v3: updated to include open-coded locations ]
>
> Signed-off-by: H. Peter Anvin<hpa@linux.intel.com>
> LKML-Reference:<tip-f88731e3068f9d1392ba71cc9f50f035d26a0d4f@git.kernel.org>
> Signed-off-by: H. Peter Anvin<hpa@zytor.com>
>

Oh, good grief, what the hell is wrong with it this time...

	-hpa
* [tip:x86/alternatives] x86, alternatives: Fix one more open-coded 8-bit alternative number
From: tip-bot for H. Peter Anvin @ 2010-07-14 0:54 UTC
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, yinhai, suresh.b.siddha, tglx, hpa

Commit-ID:  df378ccfc4dd04e263426ad805516915874774aa
Gitweb:     http://git.kernel.org/tip/df378ccfc4dd04e263426ad805516915874774aa
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Tue, 13 Jul 2010 14:55:11 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Tue, 13 Jul 2010 14:56:16 -0700

x86, alternatives: Fix one more open-coded 8-bit alternative number

Fix a missing case of an 8-bit alternative number, buried inside an
assembly macro.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Reported-by: Yinghai Lu <yinhai@kernel.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <4C3BDDA3.2060900@kernel.org>
---
 arch/x86/lib/copy_user_64.S |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S
index 71100c9..a460158 100644
--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -29,7 +29,7 @@
 	.align 8
 	.quad  0b
 	.quad  2b
-	.byte  \feature			/* when feature is set */
+	.word  \feature			/* when feature is set */
 	.byte  5
 	.byte  5
 	.previous
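
To see why the one-line change from .byte to .word matters, the following is a minimal sketch in C of what a reader of the table would see, assuming the entry tail is laid out as a 16-bit feature number followed by two 8-bit lengths, as the earlier "Use 16-bit numbers for cpufeature index" commit describes. The struct name, the helper names, the feature value 112 and the zero padding are illustrative assumptions, not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tail of one alternatives entry, after the two .quad pointers.
 * Assumption: the feature number is 16 bits, followed by two 8-bit lengths. */
struct alt_tail {
	uint16_t cpuid;
	uint8_t  instrlen;
	uint8_t  replacementlen;
};

/* Decode the tail from raw little-endian bytes, as an x86 reader would. */
static struct alt_tail decode_tail(const uint8_t *raw)
{
	struct alt_tail t;

	t.cpuid = (uint16_t)(raw[0] | (raw[1] << 8));
	t.instrlen = raw[2];
	t.replacementlen = raw[3];
	return t;
}

/* Emit the tail as the old macro did: .byte feature; .byte 5; .byte 5 */
static void emit_old(uint8_t *raw, unsigned int feature)
{
	memset(raw, 0, 4);            /* assumed: alignment padding is zero */
	raw[0] = (uint8_t)feature;    /* feature truncated to 8 bits */
	raw[1] = 5;                   /* instrlen lands in cpuid's high byte */
	raw[2] = 5;                   /* replacementlen is read back as instrlen */
}

/* Emit the tail as the fixed macro does: .word feature; .byte 5; .byte 5 */
static void emit_new(uint8_t *raw, unsigned int feature)
{
	raw[0] = (uint8_t)(feature & 0xff);
	raw[1] = (uint8_t)(feature >> 8);
	raw[2] = 5;
	raw[3] = 5;
}
```

Under these assumptions, an entry written by emit_old() for feature 112 decodes as feature number 112 + 5 * 256 = 1392 with a replacement length of 0, while emit_new() round-trips all three fields unchanged.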
* [tip:x86/alternatives] x86, alternatives: BUG on encountering an invalid CPU feature number
From: tip-bot for H. Peter Anvin @ 2010-07-14 0:54 UTC
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, yinhai, suresh.b.siddha, tglx, hpa

Commit-ID:  3b770a2128423a687e6e9c57184a584fb4ba4c77
Gitweb:     http://git.kernel.org/tip/3b770a2128423a687e6e9c57184a584fb4ba4c77
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Tue, 13 Jul 2010 14:57:50 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Tue, 13 Jul 2010 14:57:50 -0700

x86, alternatives: BUG on encountering an invalid CPU feature number

Make the alternatives-patching code BUG on encountering an invalid CPU
feature number.  Should have done this a long time ago.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Yinghai Lu <yinhai@kernel.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <tip-df378ccfc4dd04e263426ad805516915874774aa@git.kernel.org>
---
 arch/x86/kernel/alternative.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 7023773..f65ab8b 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -214,6 +214,7 @@ void __init_or_module apply_alternatives(struct alt_instr *start,
 		u8 *instr = a->instr;
 		BUG_ON(a->replacementlen > a->instrlen);
 		BUG_ON(a->instrlen > sizeof(insnbuf));
+		BUG_ON(a->cpuid >= NCAPINTS*32);
 		if (!boot_cpu_has(a->cpuid))
 			continue;
 #ifdef CONFIG_X86_64
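
The new BUG_ON() guards an array lookup: a feature number selects a 32-bit word and a bit within it, so any number at or above NCAPINTS*32 would index past the capability array. A rough user-space sketch of that check follows. DEMO_NCAPINTS and both function names are hypothetical stand-ins, the value 9 is an assumption for illustration only, and assert() takes the place of the kernel's BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NCAPINTS 9   /* hypothetical count of 32-bit capability words */

/* A feature number encodes word * 32 + bit; valid only below NCAPINTS * 32. */
static bool feature_in_range(unsigned int feature)
{
	return feature < DEMO_NCAPINTS * 32;
}

/* Test one feature bit; an out-of-range number trips the assertion instead
 * of silently reading memory beyond the array. */
static bool feature_set(const uint32_t caps[DEMO_NCAPINTS], unsigned int feature)
{
	assert(feature_in_range(feature));   /* stands in for BUG_ON() */
	return (caps[feature / 32] >> (feature % 32)) & 1u;
}
```

Without such a check, a corrupted table entry is simply treated as "feature absent" or "feature present" depending on whatever memory happens to follow the array, which may make the underlying bug much harder to spot.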
* Re: tip/master broken with x2apic and kexec
From: H. Peter Anvin @ 2010-07-13 22:00 UTC
To: Yinghai Lu
Cc: Ingo Molnar, Thomas Gleixner, Suresh Siddha, linux-kernel@vger.kernel.org

On 07/12/2010 07:59 PM, Yinghai Lu wrote:
> tip/master:
> system1: BIOS enabled x2apic, first kernel boot well, and when kexec second kernel will cause system instant reboot.
>
> system2: BIOS not enable x2apic, first kernel boot well and enable x2apic, and kexec second kernel well. but when kexec third kernel will case system instant reboot.
>
> linus' tree is ok.
>
> but for system2 if boot with nox2apic ,intr-remaping off, iommu off, the kexec loop test will pass.
>
> the problem looks start in recent two or three weeks.
>
> Any idea?
>
> bisecting will take a while, because the system post take a while everytime.
>
> Thanks
>
> Yinghai Lu

OK, I found the bug... if you could test out the patch which will be
sent out shortly I would very much appreciate it.

	-hpa
* Re: tip/master broken with x2apic and kexec
From: Yinghai Lu @ 2010-07-13 23:27 UTC
To: H. Peter Anvin
Cc: Ingo Molnar, Thomas Gleixner, Suresh Siddha, linux-kernel@vger.kernel.org

On 07/13/2010 03:00 PM, H. Peter Anvin wrote:
> On 07/12/2010 07:59 PM, Yinghai Lu wrote:
>> tip/master:
>> system1: BIOS enabled x2apic, first kernel boot well, and when kexec second kernel will cause system instant reboot.
>>
>> system2: BIOS not enable x2apic, first kernel boot well and enable x2apic, and kexec second kernel well. but when kexec third kernel will case system instant reboot.
>>
>> linus' tree is ok.
>>
>> but for system2 if boot with nox2apic ,intr-remaping off, iommu off, the kexec loop test will pass.
>>
>> the problem looks start in recent two or three weeks.
>>
>> Any idea?
>>
>> bisecting will take a while, because the system post take a while everytime.
>>
>> Thanks
>>
>> Yinghai Lu
>
> OK, I found the bug... if you could test out the patch which will be
> sent out shortly I would very much appreciate it.

Not sure now whether your patch is the offending one.

kL: kernel from Linus' tree
kT1: kernel from tip
kT2: kernel from tip with your patch reverted

BIOS-->kL ---> kL ---> kL .... always working
BIOS-->kT1 ---> kT1 ---> kT1 : instant system reset between the second and third kernel
BIOS-->kT2 ---> kT2 ---> kT2 : instant system reset between the second and third kernel

BIOS-->kL ---> kL ---> kL ---> then kT1 ---> kT1 .... always working
BIOS-->kL ---> kL ---> kL ---> then kT2 ---> kT2 .... always working

It looks like the first and second kernels cannot both be kernels from tip.

Yinghai
* Re: tip/master broken with x2apic and kexec 2010-07-13 23:27 ` Yinghai Lu @ 2010-07-14 20:35 ` Yinghai Lu 2010-07-14 21:05 ` Don Zickus 2010-07-14 21:23 ` Yinghai Lu 0 siblings, 2 replies; 19+ messages in thread From: Yinghai Lu @ 2010-07-14 20:35 UTC (permalink / raw) To: H. Peter Anvin, Ingo Molnar, Don Zickus, Frederic Weisbecker Cc: Thomas Gleixner, Suresh Siddha, linux-kernel@vger.kernel.org On 07/13/2010 04:27 PM, Yinghai Lu wrote: > On 07/13/2010 03:00 PM, H. Peter Anvin wrote: >> On 07/12/2010 07:59 PM, Yinghai Lu wrote: >>> tip/master: >>> system1: BIOS enabled x2apic, first kernel boot well, and when kexec second kernel will cause system instant reboot. >>> >>> system2: BIOS not enable x2apic, first kernel boot well and enable x2apic, and kexec second kernel well. but when kexec third kernel will case system instant reboot. >>> >>> linus' tree is ok. >>> >>> but for system2 if boot with nox2apic ,intr-remaping off, iommu off, the kexec loop test will pass. >>> >>> the problem looks start in recent two or three weeks. >>> >>> Any idea? >>> >>> bisecting will take a while, because the system post take a while everytime. >>> >>> Thanks >>> >>> Yinghai Lu >> >> OK, I found the bug... if you could test out the patch which will be >> sent out shortly I would very much appreciate it. > > not sure if your patch is the offending one now. > > kL: kernel from linus tree > kT1: kernel from tip > kT2: kernel from tip with reverting your patch > > BIOS-->kL ---> kL ---> kL....always working > BIOS-->kT1 ---> kT1 ---> kT1 : between second one and third one system reset instant... > BIOS-->kT2 ---> kT2 ---> kT2 : between second one and third one system reset instant... > > BIOS-->kL ---> kL ---> kL ---> then kT1 ---> kT1 .... always working > BIOS-->kL ---> kL ---> kL ---> then kT2 ---> kT2 .... 
always working
>

bisecting said:

> git bisect good
58687acba59266735adb8ccd9b5b9aa2c7cd205b is the first bad commit
commit 58687acba59266735adb8ccd9b5b9aa2c7cd205b
Author: Don Zickus <dzickus@redhat.com>
Date:   Fri May 7 17:11:44 2010 -0400

    lockup_detector: Combine nmi_watchdog and softlockup detector

    The new nmi_watchdog (which uses the perf event subsystem) is very
    similar in structure to the softlockup detector.  Using Ingo's
    suggestion, I combined the two functionalities into one file:
    kernel/watchdog.c.

    Now both the nmi_watchdog (or hardlockup detector) and softlockup
    detector sit on top of the perf event subsystem, which is run every
    60 seconds or so to see if there are any lockups.

    To detect hardlockups, cpus not responding to interrupts, I
    implemented an hrtimer that runs 5 times for every perf event
    overflow event.  If that stops counting on a cpu, then the cpu is
    most likely in trouble.

    To detect softlockups, tasks not yielding to the scheduler, I used the
    previous kthread idea that now gets kicked every time the hrtimer fires.
    If the kthread isn't being scheduled neither is anyone else and the
    warning is printed to the console.

    I tested this on x86_64 and both the softlockup and hardlockup paths
    work.

    V2:
    - cleaned up the Kconfig and softlockup combination
    - surrounded hardlockup cases with #ifdef CONFIG_PERF_EVENTS_NMI
    - seperated out the softlockup case from perf event subsystem
    - re-arranged the enabling/disabling nmi watchdog from proc space
    - added cpumasks for hardlockup failure cases
    - removed fallback to soft events if no PMU exists for hard events

    V3:
    - comment cleanups
    - drop support for older softlockup code
    - per_cpu cleanups
    - completely remove software clock base hardlockup detector
    - use per_cpu masking on hard/soft lockup detection
    - #ifdef cleanups
    - rename config option NMI_WATCHDOG to LOCKUP_DETECTOR
    - documentation additions

    V4:
    - documentation fixes
    - convert per_cpu to __get_cpu_var
    - powerpc compile fixes

    V5:
    - split apart warn flags for hard and soft lockups

    TODO:
    - figure out how to make an arch-agnostic clock2cycles call
      (if possible) to feed into perf events as a sample period

    [fweisbec: merged conflict patch]

    Signed-off-by: Don Zickus <dzickus@redhat.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Cyrill Gorcunov <gorcunov@gmail.com>
    Cc: Eric Paris <eparis@redhat.com>
    Cc: Randy Dunlap <randy.dunlap@oracle.com>
    LKML-Reference: <1273266711-18706-2-git-send-email-dzickus@redhat.com>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>

:040000 040000 c99baa531fdcc45b1cc4d2d3257c9a848067961b 637cfd2034d694e3fdcb0eb0b52b705d71b5078a M	Documentation
:040000 040000 0844d6f54293ec10af53a1d5ff64053dc9585a02 acb13a89b3f58130ef9677160e73b7121095da84 M	arch
:040000 040000 9b7508dba6d0a76cbec9d6c7ed82820e8c4f2a97 8016330e23998f9dfdce2512556e8a795d66aa55 M	include
:040000 040000 e6ec48f3f0314aff9a6a46706772ccd26d901830 ad70b3b8d21c8114096c8a5675393f1ab11457f5 M	init
:040000 040000 a4456db9fbda918e06e68e573f18b51f388182db ace18da3199572a1fbc2c0800a2d65f22050ff8c M	kernel
:040000 040000 120bb994855546e2e0003e54e3a382663994c00d 0e7721b41acd86ecae6ddf3c2aa6b836543aacb3 M	lib

> git bisect log
git bisect start
# bad: [6058b92b74c529f7234b92492bf634f52707a8c0] Merge branch 'x86/setup'
git bisect bad 6058b92b74c529f7234b92492bf634f52707a8c0
# good: [1c5474a65bf15a4cb162dfff86d6d0b5a08a740c] Linux 2.6.35-rc5
git bisect good 1c5474a65bf15a4cb162dfff86d6d0b5a08a740c
# good: [f12813390bebee04bbd0a070592ce57648805493] Merge branch 'tracing/urgent'
git bisect good f12813390bebee04bbd0a070592ce57648805493
# bad: [e8eb3808c6bd8d78895f6b61d4a36d8346818aad] Merge branch 'x86/urgent'
git bisect bad e8eb3808c6bd8d78895f6b61d4a36d8346818aad
# good: [bb8beea5d4df37ccfb0359329dc0053a82f38501] Merge branch 'linus'
git bisect good bb8beea5d4df37ccfb0359329dc0053a82f38501
# bad: [24e5c8ccb4d187c7a05cb77c3ac004581ad16f26] Merge branch 'linus'
git bisect bad 24e5c8ccb4d187c7a05cb77c3ac004581ad16f26
# bad: [fbde9fccc1a9da261f9f786338af10edbbfb7eb8] Merge branch 'irq/core'
git bisect bad fbde9fccc1a9da261f9f786338af10edbbfb7eb8
# good: [a9a58f907d8650db1c650688cddbecfe481f91d7] Merge branch 'perf/core'
git bisect good a9a58f907d8650db1c650688cddbecfe481f91d7
# bad: [89d7ce2a2178e7f562f608b466a18c8c2ece87af] lockup_detector: Make BOOTPARAM_SOFTLOCKUP_PANIC depend on LOCKUP_DETECTOR
git bisect bad 89d7ce2a2178e7f562f608b466a18c8c2ece87af
# bad: [2508ce1845a3b256798532b2c6b7997c2dc6533b] lockup_detector: Remove old softlockup code
git bisect bad 2508ce1845a3b256798532b2c6b7997c2dc6533b
# bad: [58687acba59266735adb8ccd9b5b9aa2c7cd205b] lockup_detector: Combine nmi_watchdog and softlockup detector
git bisect bad 58687acba59266735adb8ccd9b5b9aa2c7cd205b
# good: [a9aa1d02de36b450990b0e25a88fc2ff1c3e6b94] Merge commit 'v2.6.34-rc7' into perf/nmi
git bisect good a9aa1d02de36b450990b0e25a88fc2ff1c3e6b94
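
The bisect log above narrows the range by repeatedly testing a commit between the newest known-good and the oldest known-bad one. The idea can be sketched in C as a binary search. This is a simplification that assumes a linear, oldest-to-newest numbering of revisions and a bug that stays present once introduced; git itself works on the full commit graph, merges included. The names first_bad, is_bad_fn and bad_from are made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Caller-supplied test: returns true if revision `rev` shows the bug.
 * In the thread, one such test is a full boot plus kexec loop. */
typedef bool (*is_bad_fn)(size_t rev, void *ctx);

/*
 * Find the first bad revision in (good, bad], assuming revisions are
 * numbered oldest to newest, `good` tests good, `bad` tests bad, and
 * every revision after the first bad one is also bad.
 */
static size_t first_bad(size_t good, size_t bad, is_bad_fn is_bad, void *ctx)
{
	assert(good < bad);
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (is_bad(mid, ctx))
			bad = mid;      /* bug already present at mid */
		else
			good = mid;     /* bug introduced after mid */
	}
	return bad;
}

/* Example predicate: the bug appears at revision *(size_t *)ctx. */
static bool bad_from(size_t rev, void *ctx)
{
	return rev >= *(const size_t *)ctx;
}
```

Each step halves the remaining range, so the number of test boots grows only with the logarithm of the number of candidate commits. That is what keeps a bisect practical even when each boot has to wait for a slow BIOS POST.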
* Re: tip/master broken with x2apic and kexec 2010-07-14 20:35 ` Yinghai Lu @ 2010-07-14 21:05 ` Don Zickus 2010-07-14 22:07 ` Yinghai Lu 2010-07-14 21:23 ` Yinghai Lu 1 sibling, 1 reply; 19+ messages in thread From: Don Zickus @ 2010-07-14 21:05 UTC (permalink / raw) To: Yinghai Lu Cc: H. Peter Anvin, Ingo Molnar, Frederic Weisbecker, Thomas Gleixner, Suresh Siddha, linux-kernel@vger.kernel.org On Wed, Jul 14, 2010 at 01:35:44PM -0700, Yinghai Lu wrote: > On 07/13/2010 04:27 PM, Yinghai Lu wrote: > > On 07/13/2010 03:00 PM, H. Peter Anvin wrote: > >> On 07/12/2010 07:59 PM, Yinghai Lu wrote: > >>> tip/master: > >>> system1: BIOS enabled x2apic, first kernel boot well, and when kexec second kernel will cause system instant reboot. > >>> > >>> system2: BIOS not enable x2apic, first kernel boot well and enable x2apic, and kexec second kernel well. but when kexec third kernel will case system instant reboot. > >>> > >>> linus' tree is ok. > >>> > >>> but for system2 if boot with nox2apic ,intr-remaping off, iommu off, the kexec loop test will pass. > >>> > >>> the problem looks start in recent two or three weeks. > >>> > >>> Any idea? > >>> > >>> bisecting will take a while, because the system post take a while everytime. > >>> > >>> Thanks > >>> > >>> Yinghai Lu > >> > >> OK, I found the bug... if you could test out the patch which will be > >> sent out shortly I would very much appreciate it. > > > > not sure if your patch is the offending one now. > > > > kL: kernel from linus tree > > kT1: kernel from tip > > kT2: kernel from tip with reverting your patch > > > > BIOS-->kL ---> kL ---> kL....always working > > BIOS-->kT1 ---> kT1 ---> kT1 : between second one and third one system reset instant... > > BIOS-->kT2 ---> kT2 ---> kT2 : between second one and third one system reset instant... > > > > BIOS-->kL ---> kL ---> kL ---> then kT1 ---> kT1 .... always working > > BIOS-->kL ---> kL ---> kL ---> then kT2 ---> kT2 .... 
always working
> >
>
> bisecting said:
>
> > git bisect good
> 58687acba59266735adb8ccd9b5b9aa2c7cd205b is the first bad commit
> commit 58687acba59266735adb8ccd9b5b9aa2c7cd205b
> Author: Don Zickus <dzickus@redhat.com>
> Date: Fri May 7 17:11:44 2010 -0400

What do you mean by instant reboot?  This code isn't really exercised
until the cpus come online.

I'll dig through the history of this thread to see if there is a boot log
or something to look at.

Cheers,
Don
* Re: tip/master broken with x2apic and kexec 2010-07-14 21:05 ` Don Zickus @ 2010-07-14 22:07 ` Yinghai Lu 0 siblings, 0 replies; 19+ messages in thread From: Yinghai Lu @ 2010-07-14 22:07 UTC (permalink / raw) To: Don Zickus Cc: H. Peter Anvin, Ingo Molnar, Frederic Weisbecker, Thomas Gleixner, Suresh Siddha, linux-kernel@vger.kernel.org On 07/14/2010 02:05 PM, Don Zickus wrote: > On Wed, Jul 14, 2010 at 01:35:44PM -0700, Yinghai Lu wrote: >> On 07/13/2010 04:27 PM, Yinghai Lu wrote: >>> On 07/13/2010 03:00 PM, H. Peter Anvin wrote: >>>> On 07/12/2010 07:59 PM, Yinghai Lu wrote: >>>>> tip/master: >>>>> system1: BIOS enabled x2apic, first kernel boot well, and when kexec second kernel will cause system instant reboot. >>>>> >>>>> system2: BIOS not enable x2apic, first kernel boot well and enable x2apic, and kexec second kernel well. but when kexec third kernel will case system instant reboot. >>>>> >>>>> linus' tree is ok. >>>>> >>>>> but for system2 if boot with nox2apic ,intr-remaping off, iommu off, the kexec loop test will pass. >>>>> >>>>> the problem looks start in recent two or three weeks. >>>>> >>>>> Any idea? >>>>> >>>>> bisecting will take a while, because the system post take a while everytime. >>>>> >>>>> Thanks >>>>> >>>>> Yinghai Lu >>>> >>>> OK, I found the bug... if you could test out the patch which will be >>>> sent out shortly I would very much appreciate it. >>> >>> not sure if your patch is the offending one now. >>> >>> kL: kernel from linus tree >>> kT1: kernel from tip >>> kT2: kernel from tip with reverting your patch >>> >>> BIOS-->kL ---> kL ---> kL....always working >>> BIOS-->kT1 ---> kT1 ---> kT1 : between second one and third one system reset instant... >>> BIOS-->kT2 ---> kT2 ---> kT2 : between second one and third one system reset instant... >>> >>> BIOS-->kL ---> kL ---> kL ---> then kT1 ---> kT1 .... always working >>> BIOS-->kL ---> kL ---> kL ---> then kT2 ---> kT2 .... 
always working
>>>
>>
>> bisecting said:
>>
>>> git bisect good
>> 58687acba59266735adb8ccd9b5b9aa2c7cd205b is the first bad commit
>> commit 58687acba59266735adb8ccd9b5b9aa2c7cd205b
>> Author: Don Zickus <dzickus@redhat.com>
>> Date: Fri May 7 17:11:44 2010 -0400
>
> What do you mean by instant reboot?  This code isn't really exercised
> until the cpus come online.

When I call kexec -e, I get

Starting kernel ...

and then, where the second kernel should start booting, I get the BIOS POST instead.

Yinghai
* Re: tip/master broken with x2apic and kexec 2010-07-14 20:35 ` Yinghai Lu 2010-07-14 21:05 ` Don Zickus @ 2010-07-14 21:23 ` Yinghai Lu 2010-07-14 22:57 ` Yinghai Lu 1 sibling, 1 reply; 19+ messages in thread From: Yinghai Lu @ 2010-07-14 21:23 UTC (permalink / raw) To: H. Peter Anvin, Ingo Molnar, Don Zickus, Frederic Weisbecker Cc: Thomas Gleixner, Suresh Siddha, linux-kernel@vger.kernel.org On 07/14/2010 01:35 PM, Yinghai Lu wrote: > On 07/13/2010 04:27 PM, Yinghai Lu wrote: >> On 07/13/2010 03:00 PM, H. Peter Anvin wrote: >>> On 07/12/2010 07:59 PM, Yinghai Lu wrote: >>>> tip/master: >>>> system1: BIOS enabled x2apic, first kernel boot well, and when kexec second kernel will cause system instant reboot. >>>> >>>> system2: BIOS not enable x2apic, first kernel boot well and enable x2apic, and kexec second kernel well. but when kexec third kernel will case system instant reboot. >>>> >>>> linus' tree is ok. >>>> >>>> but for system2 if boot with nox2apic ,intr-remaping off, iommu off, the kexec loop test will pass. >>>> >>>> the problem looks start in recent two or three weeks. >>>> >>>> Any idea? >>>> >>>> bisecting will take a while, because the system post take a while everytime. >>>> >>>> Thanks >>>> >>>> Yinghai Lu >>> >>> OK, I found the bug... if you could test out the patch which will be >>> sent out shortly I would very much appreciate it. >> >> not sure if your patch is the offending one now. >> >> kL: kernel from linus tree >> kT1: kernel from tip >> kT2: kernel from tip with reverting your patch >> >> BIOS-->kL ---> kL ---> kL....always working >> BIOS-->kT1 ---> kT1 ---> kT1 : between second one and third one system reset instant... >> BIOS-->kT2 ---> kT2 ---> kT2 : between second one and third one system reset instant... >> >> BIOS-->kL ---> kL ---> kL ---> then kT1 ---> kT1 .... always working >> BIOS-->kL ---> kL ---> kL ---> then kT2 ---> kT2 .... 
always working
>>
>
> bisecting said:
>
>> git bisect good
> 58687acba59266735adb8ccd9b5b9aa2c7cd205b is the first bad commit
> commit 58687acba59266735adb8ccd9b5b9aa2c7cd205b
> Author: Don Zickus <dzickus@redhat.com>
> Date: Fri May 7 17:11:44 2010 -0400
>
>     lockup_detector: Combine nmi_watchdog and softlockup detector
>
>     The new nmi_watchdog (which uses the perf event subsystem) is very
>     similar in structure to the softlockup detector. Using Ingo's
>     suggestion, I combined the two functionalities into one file:
>     kernel/watchdog.c.
>
>     Now both the nmi_watchdog (or hardlockup detector) and softlockup
>     detector sit on top of the perf event subsystem, which is run every
>     60 seconds or so to see if there are any lockups.
>
>     To detect hardlockups, cpus not responding to interrupts, I
>     implemented an hrtimer that runs 5 times for every perf event
>     overflow event. If that stops counting on a cpu, then the cpu is
>     most likely in trouble.
>
>     To detect softlockups, tasks not yielding to the scheduler, I used the
>     previous kthread idea that now gets kicked every time the hrtimer fires.
>     If the kthread isn't being scheduled neither is anyone else and the
>     warning is printed to the console.
>
>     I tested this on x86_64 and both the softlockup and hardlockup paths
>     work.
>

With

# CONFIG_LOCKUP_DETECTOR is not set
# CONFIG_HARDLOCKUP_DETECTOR is not set

the kexec loop test passes.

That patch also breaks kexec/kdump on systems with x2apic pre-enabled.

Yinghai
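
The hardlockup check described in the quoted commit message can be sketched roughly as follows: an hrtimer bumps a per-CPU counter, and the NMI raised on perf event overflow compares that counter with the value it saw last time. This is a user-space approximation only. The struct and function names are illustrative and are not claimed to match kernel/watchdog.c, and the real code keeps this state per CPU and runs the check in NMI context.

```c
#include <stdbool.h>

/* Hypothetical per-CPU state; names are illustrative, not the kernel's. */
struct cpu_watch {
	unsigned long hrtimer_ticks;   /* bumped by the hrtimer callback */
	unsigned long ticks_seen;      /* value recorded at the last NMI check */
};

/* hrtimer callback: runs several times per NMI period while the CPU
 * is still servicing interrupts. */
static void on_hrtimer(struct cpu_watch *w)
{
	w->hrtimer_ticks++;
}

/* NMI (perf overflow) handler: returns true if the hrtimer has not
 * ticked since the previous check, i.e. the CPU looks hard-locked. */
static bool on_nmi_is_hardlockup(struct cpu_watch *w)
{
	if (w->hrtimer_ticks == w->ticks_seen)
		return true;
	w->ticks_seen = w->hrtimer_ticks;
	return false;
}
```

Because the check runs from an NMI, it still fires on a CPU that has interrupts disabled, which is exactly the state it is meant to detect. It also means a performance-counter NMI source is armed on every CPU while the detector is enabled, which is relevant to the kexec behaviour discussed in this thread.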
* Re: tip/master broken with x2apic and kexec 2010-07-14 21:23 ` Yinghai Lu @ 2010-07-14 22:57 ` Yinghai Lu 2010-07-15 0:03 ` Suresh Siddha 0 siblings, 1 reply; 19+ messages in thread From: Yinghai Lu @ 2010-07-14 22:57 UTC (permalink / raw) To: H. Peter Anvin, Ingo Molnar, Don Zickus, Frederic Weisbecker Cc: Thomas Gleixner, Suresh Siddha, linux-kernel@vger.kernel.org On 07/14/2010 02:23 PM, Yinghai Lu wrote: > On 07/14/2010 01:35 PM, Yinghai Lu wrote: >> On 07/13/2010 04:27 PM, Yinghai Lu wrote: >>> On 07/13/2010 03:00 PM, H. Peter Anvin wrote: >>>> On 07/12/2010 07:59 PM, Yinghai Lu wrote: >>>>> tip/master: >>>>> system1: BIOS enabled x2apic, first kernel boot well, and when kexec second kernel will cause system instant reboot. >>>>> >>>>> system2: BIOS not enable x2apic, first kernel boot well and enable x2apic, and kexec second kernel well. but when kexec third kernel will case system instant reboot. >>>>> >>>>> linus' tree is ok. >>>>> >>>>> but for system2 if boot with nox2apic ,intr-remaping off, iommu off, the kexec loop test will pass. >>>>> >>>>> the problem looks start in recent two or three weeks. >>>>> >>>>> Any idea? >>>>> >>>>> bisecting will take a while, because the system post take a while everytime. >>>>> >>>>> Thanks >>>>> >>>>> Yinghai Lu >>>> >>>> OK, I found the bug... if you could test out the patch which will be >>>> sent out shortly I would very much appreciate it. >>> >>> not sure if your patch is the offending one now. >>> >>> kL: kernel from linus tree >>> kT1: kernel from tip >>> kT2: kernel from tip with reverting your patch >>> >>> BIOS-->kL ---> kL ---> kL....always working >>> BIOS-->kT1 ---> kT1 ---> kT1 : between second one and third one system reset instant... >>> BIOS-->kT2 ---> kT2 ---> kT2 : between second one and third one system reset instant... >>> >>> BIOS-->kL ---> kL ---> kL ---> then kT1 ---> kT1 .... always working >>> BIOS-->kL ---> kL ---> kL ---> then kT2 ---> kT2 .... 
always working
>>>
>>
>> bisecting said:
>>
>>> git bisect good
>> 58687acba59266735adb8ccd9b5b9aa2c7cd205b is the first bad commit
>> commit 58687acba59266735adb8ccd9b5b9aa2c7cd205b
>> Author: Don Zickus <dzickus@redhat.com>
>> Date: Fri May 7 17:11:44 2010 -0400
>>
>>     lockup_detector: Combine nmi_watchdog and softlockup detector
>>
>>     The new nmi_watchdog (which uses the perf event subsystem) is very
>>     similar in structure to the softlockup detector. Using Ingo's
>>     suggestion, I combined the two functionalities into one file:
>>     kernel/watchdog.c.
>>
>>     Now both the nmi_watchdog (or hardlockup detector) and softlockup
>>     detector sit on top of the perf event subsystem, which is run every
>>     60 seconds or so to see if there are any lockups.
>>
>>     To detect hardlockups, cpus not responding to interrupts, I
>>     implemented an hrtimer that runs 5 times for every perf event
>>     overflow event. If that stops counting on a cpu, then the cpu is
>>     most likely in trouble.
>>
>>     To detect softlockups, tasks not yielding to the scheduler, I used the
>>     previous kthread idea that now gets kicked every time the hrtimer fires.
>>     If the kthread isn't being scheduled neither is anyone else and the
>>     warning is printed to the console.
>>
>>     I tested this on x86_64 and both the softlockup and hardlockup paths
>>     work.
>>
>
> with
> # CONFIG_LOCKUP_DETECTOR is not set
> # CONFIG_HARDLOCKUP_DETECTOR is not set
>
> kexec loop test could passed.
>
> also that patch will break x2apic preenabled system 's kexec/kdump.

Before the combining patch, a kernel built with

CONFIG_DETECT_SOFTLOCKUP=y
CONFIG_NMI_WATCHDOG=y

has the same problem.

So the problem should come from NMI_WATCHDOG.

Yinghai
* Re: tip/master broken with x2apic and kexec
  2010-07-14 22:57 ` Yinghai Lu
@ 2010-07-15 0:03 ` Suresh Siddha
  2010-07-15 2:01 ` Yinghai Lu
  2010-07-15 7:00 ` [PATCH] x86: fix x2apic preenabled system with kexec Yinghai Lu
  0 siblings, 2 replies; 19+ messages in thread
From: Suresh Siddha @ 2010-07-15 0:03 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: H. Peter Anvin, Ingo Molnar, Don Zickus, Frederic Weisbecker, Thomas Gleixner, linux-kernel@vger.kernel.org

On Wed, 2010-07-14 at 15:57 -0700, Yinghai Lu wrote:
> On 07/14/2010 02:23 PM, Yinghai Lu wrote:
> > On 07/14/2010 01:35 PM, Yinghai Lu wrote:
> >> On 07/13/2010 04:27 PM, Yinghai Lu wrote:
> >>> On 07/13/2010 03:00 PM, H. Peter Anvin wrote:
> >>>> On 07/12/2010 07:59 PM, Yinghai Lu wrote:
> >>>>> tip/master:
> >>>>> system1: BIOS enabled x2apic, first kernel boot well, and when kexec second kernel will cause system instant reboot.
> >>>>>
> >>>>> system2: BIOS not enable x2apic, first kernel boot well and enable x2apic, and kexec second kernel well. but when kexec third kernel will case system instant reboot.
> >>>>>
> >>>>> linus' tree is ok.
> >>>>>
> >>>>> but for system2 if boot with nox2apic ,intr-remaping off, iommu off, the kexec loop test will pass.
> >>>>>
> >>>>> the problem looks start in recent two or three weeks.
> >>>>>
> >>>>> Any idea?
> >>>>>
> >>>>> bisecting will take a while, because the system post take a while everytime.
> >>>>>
> >>>>> Thanks
> >>>>>
> >>>>> Yinghai Lu
> >>>>
> >>>> OK, I found the bug... if you could test out the patch which will be
> >>>> sent out shortly I would very much appreciate it.
> >>>
> >>> not sure if your patch is the offending one now.
> >>>
> >>> kL: kernel from linus tree
> >>> kT1: kernel from tip
> >>> kT2: kernel from tip with reverting your patch
> >>>
> >>> BIOS-->kL ---> kL ---> kL....always working
> >>> BIOS-->kT1 ---> kT1 ---> kT1 : between second one and third one system reset instant...
> >>> BIOS-->kT2 ---> kT2 ---> kT2 : between second one and third one system reset instant...
> >>>
> >>> BIOS-->kL ---> kL ---> kL ---> then kT1 ---> kT1 .... always working
> >>> BIOS-->kL ---> kL ---> kL ---> then kT2 ---> kT2 .... always working
> >>>
> >>
> >> bisecting said:
> >>
> >>> git bisect good
> >> 58687acba59266735adb8ccd9b5b9aa2c7cd205b is the first bad commit
> >> commit 58687acba59266735adb8ccd9b5b9aa2c7cd205b
> >> Author: Don Zickus <dzickus@redhat.com>
> >> Date: Fri May 7 17:11:44 2010 -0400
> >>
> >>     lockup_detector: Combine nmi_watchdog and softlockup detector
> >>
> >>     The new nmi_watchdog (which uses the perf event subsystem) is very
> >>     similar in structure to the softlockup detector. Using Ingo's
> >>     suggestion, I combined the two functionalities into one file:
> >>     kernel/watchdog.c.
> >>
> >>     Now both the nmi_watchdog (or hardlockup detector) and softlockup
> >>     detector sit on top of the perf event subsystem, which is run every
> >>     60 seconds or so to see if there are any lockups.
> >>
> >>     To detect hardlockups, cpus not responding to interrupts, I
> >>     implemented an hrtimer that runs 5 times for every perf event
> >>     overflow event. If that stops counting on a cpu, then the cpu is
> >>     most likely in trouble.
> >>
> >>     To detect softlockups, tasks not yielding to the scheduler, I used the
> >>     previous kthread idea that now gets kicked every time the hrtimer fires.
> >>     If the kthread isn't being scheduled neither is anyone else and the
> >>     warning is printed to the console.
> >>
> >>     I tested this on x86_64 and both the softlockup and hardlockup paths
> >>     work.
> >>
> >
> > with
> > # CONFIG_LOCKUP_DETECTOR is not set
> > # CONFIG_HARDLOCKUP_DETECTOR is not set
> >
> > kexec loop test could passed.
> >
> > also that patch will break x2apic preenabled system 's kexec/kdump.
>
> before the combining patch
>
> CONFIG_DETECT_SOFTLOCKUP=y
> CONFIG_NMI_WATCHDOG=y
>
> will have the same problem.
>
> so the problem should come from NMI_WATCHDOG.

Yinghai, It looks like some timing issue wrt nmi handling/kexec and
perhaps not directly related to x2apic? Perhaps we should try with
x2apic disabled but with intr-remapping enabled etc to see if it changes
anything. Also do we know (like serial console log etc) how far ahead we
went in the kexec before we rebooted?

thanks,
suresh

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: tip/master broken with x2apic and kexec
  2010-07-15 0:03 ` Suresh Siddha
@ 2010-07-15 2:01 ` Yinghai Lu
  2010-07-15 7:00 ` [PATCH] x86: fix x2apic preenabled system with kexec Yinghai Lu
  1 sibling, 0 replies; 19+ messages in thread
From: Yinghai Lu @ 2010-07-15 2:01 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: H. Peter Anvin, Ingo Molnar, Don Zickus, Frederic Weisbecker, Thomas Gleixner, linux-kernel@vger.kernel.org

On 07/14/2010 05:03 PM, Suresh Siddha wrote:
> On Wed, 2010-07-14 at 15:57 -0700, Yinghai Lu wrote:
>> On 07/14/2010 02:23 PM, Yinghai Lu wrote:
>>> On 07/14/2010 01:35 PM, Yinghai Lu wrote:
>>>> On 07/13/2010 04:27 PM, Yinghai Lu wrote:
>>>>> On 07/13/2010 03:00 PM, H. Peter Anvin wrote:
>>>>>> On 07/12/2010 07:59 PM, Yinghai Lu wrote:
>>>>>>> tip/master:
>>>>>>> system1: BIOS enabled x2apic, first kernel boot well, and when kexec second kernel will cause system instant reboot.
>>>>>>>
>>>>>>> system2: BIOS not enable x2apic, first kernel boot well and enable x2apic, and kexec second kernel well. but when kexec third kernel will case system instant reboot.
>>>>>>>
>>>>>>> linus' tree is ok.
>>>>>>>
>>>>>>> but for system2 if boot with nox2apic ,intr-remaping off, iommu off, the kexec loop test will pass.
>>>>>>>
>>>>>>> the problem looks start in recent two or three weeks.
>>>>>>>
>>>>>>> Any idea?
>>>>>>>
>>>>>>> bisecting will take a while, because the system post take a while everytime.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Yinghai Lu
>>>>>>
>>>>>> OK, I found the bug... if you could test out the patch which will be
>>>>>> sent out shortly I would very much appreciate it.
>>>>>
>>>>> not sure if your patch is the offending one now.
>>>>>
>>>>> kL: kernel from linus tree
>>>>> kT1: kernel from tip
>>>>> kT2: kernel from tip with reverting your patch
>>>>>
>>>>> BIOS-->kL ---> kL ---> kL....always working
>>>>> BIOS-->kT1 ---> kT1 ---> kT1 : between second one and third one system reset instant...
>>>>> BIOS-->kT2 ---> kT2 ---> kT2 : between second one and third one system reset instant...
>>>>>
>>>>> BIOS-->kL ---> kL ---> kL ---> then kT1 ---> kT1 .... always working
>>>>> BIOS-->kL ---> kL ---> kL ---> then kT2 ---> kT2 .... always working
>>>>>
>>>>
>>>> bisecting said:
>>>>
>>>>> git bisect good
>>>> 58687acba59266735adb8ccd9b5b9aa2c7cd205b is the first bad commit
>>>> commit 58687acba59266735adb8ccd9b5b9aa2c7cd205b
>>>> Author: Don Zickus <dzickus@redhat.com>
>>>> Date: Fri May 7 17:11:44 2010 -0400
>>>>
>>>>     lockup_detector: Combine nmi_watchdog and softlockup detector
>>>>
>>>>     The new nmi_watchdog (which uses the perf event subsystem) is very
>>>>     similar in structure to the softlockup detector. Using Ingo's
>>>>     suggestion, I combined the two functionalities into one file:
>>>>     kernel/watchdog.c.
>>>>
>>>>     Now both the nmi_watchdog (or hardlockup detector) and softlockup
>>>>     detector sit on top of the perf event subsystem, which is run every
>>>>     60 seconds or so to see if there are any lockups.
>>>>
>>>>     To detect hardlockups, cpus not responding to interrupts, I
>>>>     implemented an hrtimer that runs 5 times for every perf event
>>>>     overflow event. If that stops counting on a cpu, then the cpu is
>>>>     most likely in trouble.
>>>>
>>>>     To detect softlockups, tasks not yielding to the scheduler, I used the
>>>>     previous kthread idea that now gets kicked every time the hrtimer fires.
>>>>     If the kthread isn't being scheduled neither is anyone else and the
>>>>     warning is printed to the console.
>>>>
>>>>     I tested this on x86_64 and both the softlockup and hardlockup paths
>>>>     work.
>>>>
>>>
>>> with
>>> # CONFIG_LOCKUP_DETECTOR is not set
>>> # CONFIG_HARDLOCKUP_DETECTOR is not set
>>>
>>> kexec loop test could passed.
>>>
>>> also that patch will break x2apic preenabled system 's kexec/kdump.
>>
>> before the combining patch
>>
>> CONFIG_DETECT_SOFTLOCKUP=y
>> CONFIG_NMI_WATCHDOG=y
>>
>> will have the same problem.
>>
>> so the problem should come from NMI_WATCHDOG.
>
> Yinghai, It looks like some timing issue wrt nmi handling/kexec and
> perhaps not directly related to x2apic? Perhaps we should try with
> x2apic disabled but with intr-remapping enabled etc to see if it changes
> anything.

only have "nox2apic", without "nointremap intel_iommu=off"
the kexec loop test work well.

So it is x2apic, nmi_watchdog related...

> Also do we know (like serial console log etc) how far ahead we
> went in the kexec before we rebooted?

will add more printk after "Starting new kernel" to check it.

Thanks

Yinghai

^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH] x86: fix x2apic preenabled system with kexec
  2010-07-15 0:03 ` Suresh Siddha
  2010-07-15 2:01 ` Yinghai Lu
@ 2010-07-15 7:00 ` Yinghai Lu
  2010-07-15 18:16 ` Suresh Siddha
  2010-07-17 0:48 ` [tip:x86/urgent] x86: Fix " tip-bot for Yinghai Lu
  1 sibling, 2 replies; 19+ messages in thread
From: Yinghai Lu @ 2010-07-15 7:00 UTC (permalink / raw)
  To: Suresh Siddha, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Andrew Morton
  Cc: Don Zickus, Frederic Weisbecker, linux-kernel@vger.kernel.org, stable

Found one x2apic system kexec loop test failed
when CONFIG_NMI_WATCHDOG=y (old) or CONFIG_LOCKUP_DETECTOR=y (current tip)

first kernel can kexec second kernel, but second kernel can not kexec third one.

it can be duplicated on another system with BIOS preenabled x2apic.
First kernel can not kexec second kernel.

It turns out, when kernel boot with pre-enabled x2apic, it will not execute
disable_local_APIC on shutdown path.

when init_apic_mappings() is called in setup_arch, it will skip setting of
apic_phys when x2apic_mode is set. ( x2apic_mode is much early check_x2apic())
Then later, disable_local_APIC() will bail out early because !apic_phys.

So check !x2apic_mode in x2apic_mode in disable_local_APIC with !apic_phys.

another solution could be updating init_apic_mappings() to set apic_phys even
for preenabled x2apic system. Actually even for x2apic system, that lapic
address is mapped already in early stage.

BTW: is there any x2apic preenabled system with apicid of boot cpu > 255?

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: stable@kernel.org

---
 arch/x86/kernel/apic/apic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/arch/x86/kernel/apic/apic.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/apic/apic.c
+++ linux-2.6/arch/x86/kernel/apic/apic.c
@@ -921,7 +921,7 @@ void disable_local_APIC(void)
 	unsigned int value;
 
 	/* APIC hasn't been mapped yet */
-	if (!apic_phys)
+	if (!x2apic_mode && !apic_phys)
 		return;
 
 	clear_local_APIC();

^ permalink raw reply [flat|nested] 19+ messages in thread
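[Editorial sketch, not part of the original message.] The one-line change above can be read as a small truth table over x2apic_mode and apic_phys. The following is a minimal userspace model of only that guard, written in C under stated assumptions: struct lapic_state and both function names are hypothetical stand-ins, and nothing here touches real APIC hardware or reproduces the rest of disable_local_APIC().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two kernel globals named in the patch. */
struct lapic_state {
	bool x2apic_mode;        /* set early when x2apic was found pre-enabled */
	unsigned long apic_phys; /* stays 0 if init_apic_mappings() skipped it */
};

/* Guard before the patch: return early whenever apic_phys is 0. */
bool old_guard_bails_out(const struct lapic_state *s)
{
	return !s->apic_phys;
}

/* Guard after the patch: return early only if not in x2apic mode AND unmapped. */
bool new_guard_bails_out(const struct lapic_state *s)
{
	return !s->x2apic_mode && !s->apic_phys;
}
```

With x2apic_mode set and apic_phys equal to 0 (the pre-enabled case described in the changelog), the old guard returns early and the local APIC is left enabled across kexec, while the new guard lets the shutdown path continue. For a plain xAPIC system with apic_phys set, both guards behave the same.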
* Re: [PATCH] x86: fix x2apic preenabled system with kexec
  2010-07-15 7:00 ` [PATCH] x86: fix x2apic preenabled system with kexec Yinghai Lu
@ 2010-07-15 18:16 ` Suresh Siddha
  2010-07-15 20:10 ` Yinghai Lu
  2010-07-17 0:48 ` [tip:x86/urgent] x86: Fix " tip-bot for Yinghai Lu
  1 sibling, 1 reply; 19+ messages in thread
From: Suresh Siddha @ 2010-07-15 18:16 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Andrew Morton, Don Zickus, Frederic Weisbecker, linux-kernel@vger.kernel.org, stable

On Thu, 2010-07-15 at 00:00 -0700, Yinghai Lu wrote:
> Found one x2apic system kexec loop test failed
> when CONFIG_NMI_WATCHDOG=y (old) or CONFIG_LOCKUP_DETECTOR=y (current tip)
>
> first kernel can kexec second kernel, but second kernel can not kexec third one.
>
> it can be duplicated on another system with BIOS preenabled x2apic.
> First kernel can not kexec second kernel.
>
> It turns out, when kernel boot with pre-enabled x2apic, it will not execute
> disable_local_APIC on shutdown path.
>
> when init_apic_mappings() is called in setup_arch, it will skip setting of
> apic_phys when x2apic_mode is set. ( x2apic_mode is much early check_x2apic())
> Then later, disable_local_APIC() will bail out early because !apic_phys.
>
> So check !x2apic_mode in x2apic_mode in disable_local_APIC with !apic_phys.

Thanks for the nice debug work! As we still have NMI enabled, it looks
like we get a NMI during kexec and as we reset gdt/idt before kexec
launch, we might get a triple fault causing the system to reboot.

> another solution could be updating init_apic_mappings() to set apic_phys even
> for preenabled x2apic system. Actually even for x2apic system, that lapic
> address is mapped already in early stage.

Below patch is the right one. We should probably unmap apic_phys mapping
when x2apic is enabled by the OS.

> BTW: is there any x2apic preenabled system with apicid of boot cpu > 255?

I am not sure. There might be one. Is there any bug which can't handle
this condition?

>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> Cc: stable@kernel.org

For this patch:

Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>

> ---
>  arch/x86/kernel/apic/apic.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux-2.6/arch/x86/kernel/apic/apic.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/apic/apic.c
> +++ linux-2.6/arch/x86/kernel/apic/apic.c
> @@ -921,7 +921,7 @@ void disable_local_APIC(void)
>  	unsigned int value;
>
>  	/* APIC hasn't been mapped yet */
> -	if (!apic_phys)
> +	if (!x2apic_mode && !apic_phys)
>  		return;
>
>  	clear_local_APIC();

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86: fix x2apic preenabled system with kexec
  2010-07-15 18:16 ` Suresh Siddha
@ 2010-07-15 20:10 ` Yinghai Lu
  2010-07-15 20:40 ` Yinghai Lu
  0 siblings, 1 reply; 19+ messages in thread
From: Yinghai Lu @ 2010-07-15 20:10 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Andrew Morton, Don Zickus, Frederic Weisbecker, linux-kernel@vger.kernel.org, stable

On 07/15/2010 11:16 AM, Suresh Siddha wrote:
> On Thu, 2010-07-15 at 00:00 -0700, Yinghai Lu wrote:
>
>> BTW: is there any x2apic preenabled system with apicid of boot cpu > 255?
>
> I am not sure. There might be one. Is there any bug which can't handle
> this condition?

We merged apic_ops into struct apic a while ago.

So even on a system with x2apic preenabled by the BIOS, x2apic_cluster/phys is not assigned to apic until smp_prepare_cpus()::default_setup_apic_routing(), after enable_IR_x2apic().

That means the boot cpu's x2apic is accessed via the memory-mapped way instead of the MSR-based way at that point.

I am not sure what happens if the boot apic id is bigger than 255: read_apic() for the apic id could be wrong (in early_acpi_boot_init, acpi_boot_init, init_apic_mappings).

It looks like we need to re-read boot_cpu_physical_apicid,
or we could assign x2apic_cluster/phys in check_x2apic(), and later, if intr_remapping can not be enabled, revert back to phys_flat or flat?

Thanks

Yinghai Lu

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] x86: fix x2apic preenabled system with kexec
  2010-07-15 20:10 ` Yinghai Lu
@ 2010-07-15 20:40 ` Yinghai Lu
  0 siblings, 0 replies; 19+ messages in thread
From: Yinghai Lu @ 2010-07-15 20:40 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Andrew Morton, Don Zickus, Frederic Weisbecker, linux-kernel@vger.kernel.org, stable

On 07/15/2010 01:10 PM, Yinghai Lu wrote:
> On 07/15/2010 11:16 AM, Suresh Siddha wrote:
>> On Thu, 2010-07-15 at 00:00 -0700, Yinghai Lu wrote:
>>
>>> BTW: is there any x2apic preenabled system with apicid of boot cpu > 255?
>>
>> I am not sure. There might be one. Is there any bug which can't handle
>> this condition?
>
> We merged apic_ops into struct apic a while ago.
>
> So even on a system with x2apic preenabled by the BIOS, x2apic_cluster/phys is not assigned to apic until smp_prepare_cpus()::default_setup_apic_routing(), after enable_IR_x2apic().
>
> That means the boot cpu's x2apic is accessed via the memory-mapped way instead of the MSR-based way at that point.
>
> I am not sure what happens if the boot apic id is bigger than 255: read_apic() for the apic id could be wrong (in early_acpi_boot_init, acpi_boot_init, init_apic_mappings).
>
> It looks like we need to re-read boot_cpu_physical_apicid,
> or we could assign x2apic_cluster/phys in check_x2apic(), and later, if intr_remapping can not be enabled, revert back to phys_flat or flat?
>

Never mind, early_acpi_boot_init::acpi_parse_madt::default_acpi_madt_oem_check() will handle the case.

It will set the apic to apic_x2apic_... accordingly if x2apic is preenabled.

Thanks

Yinghai

^ permalink raw reply [flat|nested] 19+ messages in thread
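[Editorial sketch, not part of the original messages.] The "apicid of boot cpu > 255" question comes down to field width: in xAPIC mode the ID sits in bits 31:24 of the APIC ID register (MMIO offset 0x20), so it is 8 bits wide, while in x2APIC mode the APIC ID register (MSR 0x802) carries a 32-bit ID. The C fragment below models only that width difference. The function names are hypothetical, and it deliberately does not model what an MMIO read returns on hardware that is already in x2APIC mode.

```c
#include <assert.h>
#include <stdint.h>

/* xAPIC layout: the ID is the top byte of the 32-bit APIC ID register. */
uint32_t xapic_id_from_reg(uint32_t id_reg)
{
	return (id_reg >> 24) & 0xffu;
}

/* x2APIC layout: the low 32 bits of the ID MSR are the ID itself. */
uint32_t x2apic_id_from_msr(uint32_t msr_low)
{
	return msr_low;
}

/* Nonzero if an ID can be represented in the 8-bit xAPIC field. */
int fits_in_xapic_id(uint32_t id)
{
	return id <= 0xffu;
}
```

An ID such as 300 fits the x2APIC register but not the 8-bit xAPIC field, which is why reading the boot CPU's ID through the wrong access method was a concern until the follow-up noted that the MADT OEM check selects the x2apic driver early.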
* [tip:x86/urgent] x86: Fix x2apic preenabled system with kexec
  2010-07-15 7:00 ` [PATCH] x86: fix x2apic preenabled system with kexec Yinghai Lu
  2010-07-15 18:16 ` Suresh Siddha
@ 2010-07-17 0:48 ` tip-bot for Yinghai Lu
  1 sibling, 0 replies; 19+ messages in thread
From: tip-bot for Yinghai Lu @ 2010-07-17 0:48 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, yinghai, suresh.b.siddha, tglx, hpa

Commit-ID:  fd19dce7ac07973f700b0f13fb7f94b951414a4c
Gitweb:     http://git.kernel.org/tip/fd19dce7ac07973f700b0f13fb7f94b951414a4c
Author:     Yinghai Lu <yinghai@kernel.org>
AuthorDate: Thu, 15 Jul 2010 00:00:59 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Fri, 16 Jul 2010 16:49:41 -0700

x86: Fix x2apic preenabled system with kexec

Found one x2apic system kexec loop test failed
when CONFIG_NMI_WATCHDOG=y (old) or CONFIG_LOCKUP_DETECTOR=y (current tip)

first kernel can kexec second kernel, but second kernel can not kexec third one.

it can be duplicated on another system with BIOS preenabled x2apic.
First kernel can not kexec second kernel.

It turns out, when kernel boot with pre-enabled x2apic, it will not execute
disable_local_APIC on shutdown path.

when init_apic_mappings() is called in setup_arch, it will skip setting of
apic_phys when x2apic_mode is set. ( x2apic_mode is much early check_x2apic())
Then later, disable_local_APIC() will bail out early because !apic_phys.

So check !x2apic_mode in x2apic_mode in disable_local_APIC with !apic_phys.

another solution could be updating init_apic_mappings() to set apic_phys even
for preenabled x2apic system. Actually even for x2apic system, that lapic
address is mapped already in early stage.

BTW: is there any x2apic preenabled system with apicid of boot cpu > 255?

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4C3EB22B.3000701@kernel.org>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/kernel/apic/apic.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index c02cc69..a96489e 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -921,7 +921,7 @@ void disable_local_APIC(void)
 	unsigned int value;
 
 	/* APIC hasn't been mapped yet */
-	if (!apic_phys)
+	if (!x2apic_mode && !apic_phys)
 		return;
 
 	clear_local_APIC();

^ permalink raw reply related [flat|nested] 19+ messages in thread
end of thread, other threads:[~2010-07-17 0:49 UTC | newest]

Thread overview: 19+ messages
2010-07-13 2:59 tip/master broken with x2apic and kexec Yinghai Lu
2010-07-13 3:29 ` Yinghai Lu
2010-07-13 6:40 ` H. Peter Anvin
2010-07-14 0:54 ` [tip:x86/alternatives] x86, alternatives: Fix one more open-coded 8-bit alternative number tip-bot for H. Peter Anvin
2010-07-14 0:54 ` [tip:x86/alternatives] x86, alternatives: BUG on encountering an invalid CPU feature number tip-bot for H. Peter Anvin
2010-07-13 22:00 ` tip/master broken with x2apic and kexec H. Peter Anvin
2010-07-13 23:27 ` Yinghai Lu
2010-07-14 20:35 ` Yinghai Lu
2010-07-14 21:05 ` Don Zickus
2010-07-14 22:07 ` Yinghai Lu
2010-07-14 21:23 ` Yinghai Lu
2010-07-14 22:57 ` Yinghai Lu
2010-07-15 0:03 ` Suresh Siddha
2010-07-15 2:01 ` Yinghai Lu
2010-07-15 7:00 ` [PATCH] x86: fix x2apic preenabled system with kexec Yinghai Lu
2010-07-15 18:16 ` Suresh Siddha
2010-07-15 20:10 ` Yinghai Lu
2010-07-15 20:40 ` Yinghai Lu
2010-07-17 0:48 ` [tip:x86/urgent] x86: Fix " tip-bot for Yinghai Lu