From: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
To: linux-ia64@vger.kernel.org
Subject: wrong initial ia64_kr(current_stack) value
Date: Thu, 09 Oct 2003 15:36:51 +0000
Message-ID: <marc-linux-ia64-106571413927132@msgid-missing>
[-- Attachment #1: Type: text/plain, Size: 2249 bytes --]
We started seeing random kernel hangs at a fairly late stage of booting, when lots of processes are spawned by the init script. The kernel used is a variant of 2.4.21. At the time of the hang, one CPU is stuck in the page fault handler with no apparent valid dtlb mapping for the kernel stack. Interestingly enough, the task that the stuck CPU is executing has its kernel stack allocated out of the 16-32MB physical memory range (we are using a 16MB kernel granule in this exercise). We finally tracked it down to a bug in _start() where IA64_KR(CURRENT_STACK) was incorrectly initialized.
This code in head.S is wrong:
mov r16=KERNEL_TR_PAGE_NUM
;;
// load the "current" pointer (r13) and ar.k6 with the current task
mov r13=r2
mov IA64_KR(CURRENT)=r3 // Physical address
// initialize k4 to a safe value (64-128MB is mapped by TR_KERNEL)
mov IA64_KR(CURRENT_STACK)=r16
r16 is loaded with the kernel page number measured in (1<<KERNEL_TR_PAGE_SHIFT) pages, but the check in ia64_switch_to() expects it to be in (1<<IA64_GRANULE_SHIFT) units. When the granule size is 16MB, we can hit a problem if the task structure area (the 2 pages allocated for task_struct and the kernel stack) of the first process that we switch to out of idle is in the physical address range [16MB,32MB], as the check in ia64_switch_to() will mistakenly conclude that this mapping is already loaded in dtr[2] when actually it is not.
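To make the unit mismatch concrete, here is a small C sketch of the arithmetic. The constants are assumptions inferred from this message (a kernel TR covering 64-128MB, hence 64MB TR pages, and a 16MB granule); the real values come from the kernel configuration, and KERNEL_PHYS_START and the helper names are hypothetical stand-ins, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, inferred from the message: the kernel TR maps 64-128MB
 * (64MB pages) and the granule is 16MB. */
#define KERNEL_TR_PAGE_SHIFT 26u
#define IA64_GRANULE_SHIFT   24u
/* Hypothetical stand-in for (KERNEL_START - PAGE_OFFSET). */
#define KERNEL_PHYS_START    (UINT64_C(64) << 20)

/* Index of the unit of size (1 << shift) that contains phys_addr. */
static uint64_t unit_number(uint64_t phys_addr, unsigned shift)
{
    assert(shift < 64); /* shifting a 64-bit value by >= 64 is undefined */
    return phys_addr >> shift;
}

/* First physical address covered by a given granule number. */
static uint64_t granule_start(uint64_t granule)
{
    return granule << IA64_GRANULE_SHIFT;
}

/* One past the last physical address covered by a given granule number. */
static uint64_t granule_end(uint64_t granule)
{
    return (granule + 1) << IA64_GRANULE_SHIFT;
}
```

Under these assumptions, unit_number(KERNEL_PHYS_START, KERNEL_TR_PAGE_SHIFT) is 1, while unit_number(KERNEL_PHYS_START, IA64_GRANULE_SHIFT) is 4. Read as a granule number, the value 1 names the range from granule_start(1) to granule_end(1), that is 16MB to 32MB, which is exactly the range where the hang was observed.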
The hang ends up as a nested TLB fault in which the secondary DTLB miss for the kernel stack never completes. The mishap is due to the initialization code using the wrong page size when computing the initial value of the "safe" page number stored in the kernel register that marks which address is mapped for the stack. ia64_switch_to is really confused about which page is mapped in the DTR when coming out of idle, because someone lied to him.
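The failure can be illustrated with a toy model in C. This is only a sketch of the decision described here, not the actual ia64_switch_to() assembly: struct cpu_model and all function names are hypothetical, and the 16MB granule and the 64-128MB kernel TR range are assumptions taken from this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values from the report: 16MB granule, kernel TR maps 64-128MB. */
#define IA64_GRANULE_SHIFT 24u
#define KERNEL_TR_START    (UINT64_C(64) << 20)
#define KERNEL_TR_END      (UINT64_C(128) << 20)

/* Hypothetical per-CPU state, used by this model only. */
struct cpu_model {
    uint64_t kr_current_stack; /* granule number the kernel register claims */
    bool     stack_tr_valid;   /* has a stack translation really been pinned? */
    uint64_t stack_tr_granule; /* granule covered by that translation */
};

/* True if phys is reachable without a TLB miss in this model. */
static bool is_mapped(const struct cpu_model *cpu, uint64_t phys)
{
    if (phys >= KERNEL_TR_START && phys < KERNEL_TR_END)
        return true; /* covered by the kernel TR */
    return cpu->stack_tr_valid &&
           cpu->stack_tr_granule == (phys >> IA64_GRANULE_SHIFT);
}

/* Models the check: remap only when the granule differs from the record. */
static void switch_to_model(struct cpu_model *cpu, uint64_t stack_phys)
{
    uint64_t granule = stack_phys >> IA64_GRANULE_SHIFT;

    if (granule == cpu->kr_current_stack)
        return; /* believed to be mapped already, so nothing is loaded */
    cpu->stack_tr_valid = true;
    cpu->stack_tr_granule = granule;
    cpu->kr_current_stack = granule;
}

/* First switch out of idle on a CPU whose register started as initial_kr.
 * Returns true if the new stack is really mapped afterwards. */
static bool first_switch_ok(uint64_t initial_kr, uint64_t stack_phys)
{
    struct cpu_model cpu = { .kr_current_stack = initial_kr };

    assert(stack_phys < (UINT64_C(1) << 61)); /* physical address expected */
    switch_to_model(&cpu, stack_phys);
    return is_mapped(&cpu, stack_phys);
}
```

In this model, first_switch_ok(1, 20MB) returns false: the register claims granule 1 is mapped, the remap is skipped, and nothing actually covers the stack. With the register initialized in granule units instead, first_switch_ok(4, 20MB) returns true, because the mismatch forces a mapping to be installed.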
I'm surprised that this bug has stayed hidden for so long. It could happen on any SMP system out there, but it is easier for the bug to bite on a system with lots of CPUs. Here is a patch that fixes the problem. Kudos to Tony Luck, Kimi Suganuma and Nomura-san for helping me track this down.
- Ken
P.S. Only the 2.4.x kernels have this bug.
[-- Attachment #2: stack_tr.patch --]
[-- Type: application/octet-stream, Size: 451 bytes --]
===== arch/ia64/kernel/head.S 1.1 vs edited =====
--- 1.1/arch/ia64/kernel/head.S Fri Oct 3 23:00:15 2003
+++ edited/arch/ia64/kernel/head.S Wed Oct 8 13:37:45 2003
@@ -147,7 +147,7 @@
cmp4.ne isAP,isBP=r3,r0
;; // RAW on r2
extr r3=r2,0,61 // r3 == phys addr of task struct
- mov r16=KERNEL_TR_PAGE_NUM
+ mov r16=(KERNEL_START - PAGE_OFFSET) / IA64_GRANULE_SIZE
;;
// load the "current" pointer (r13) and ar.k6 with the current task