From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keith Owens
Date: Sat, 11 Jun 2005 04:08:56 +0000
Subject: Re: [RFD] Separating struct task and the kernel stacks
Message-Id: <11503.1118462936@ocs3.ocs.com.au>
List-Id:
References: <9712.1118384111@kao2.melbourne.sgi.com>
In-Reply-To: <9712.1118384111@kao2.melbourne.sgi.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: linux-ia64@vger.kernel.org

On Fri, 10 Jun 2005 10:03:14 -0700, David Mosberger wrote:
>>>>>> On Fri, 10 Jun 2005 08:11:42 -0700 (PDT), Christoph Lameter said:
>
> >> Switching stacks requires that struct task is copied from the
> >> original "current" to the MCA/INIT stack, then change current to
> >> point to the new stack. Even that is not enough, there are still
> >> places that are using the old value of "current". The main
> >> problem is the scheduler, it tracks tasks by the address of their
> >> struct task, not by the kernel stack address. When debugging an
> >> MCA/INIT, the mismatch between the new value of current and the
> >> old task addresses in various structures can lead to some very
> >> confusing results. The kernel is not designed to have struct
> >> task move around on the fly.
>
> Christoph> Could you just move the stack? Put a pointer to the stack
> Christoph> in task_info. By default this is pointing to the stack in
> Christoph> task_info. If you have to switch point it elsewhere.

Exactly what I was suggesting. Separate struct task from the stack, so
the stack just contains thread_info and the register and memory stacks,
with struct task pointing to one of several stacks. But as David has
pointed out, that is going to be less efficient.

>Perhaps a more fruitful approach might be to treat the MCA as its own
>task.

That is a promising idea. Preformat the MCA/INIT stacks like the init
task, marking them interrupts disabled, non-preemptible etc.
To avoid any disagreement with what the scheduler thinks is the system
state, mark the MCA/INIT tasks as not running on any cpu, even though
they are really in control while they are handling the event.

Some of the registers belonging to the interrupted task will be in RBS
on the MCA/INIT stack, which would normally stop us investigating the
original state. The MCA/INIT handler would copy those registers back
to the original stack and add a switch_stack to make it look like the
original task is blocked. This assumes that the MCA/INIT event
occurred while the cpu was running on a kernel stack and that there is
enough room on that stack to save the state.

current and its corresponding DTC still have to be switched to point to
the MCA/INIT stack. There are too many places where current is tested
and we want almost all of those places to pick up the MCA/INIT state,
not the original. For the few cases where we want the original value
of current (backtrace is the obvious case), we can detect that this is
the MCA/INIT stack and use the original value for current.

Stack switching from the MCA/INIT stack to the original stack is no
longer required to backtrace the original task, which nicely removes
the problem of how to switch between kernel stacks when unwinding.

The detection of whether a task is blocked or not would have to change
slightly. Currently a task is blocked if it is not on a cpu, which is
detected by comparing the task pointer against cpu_rq(cpu)->curr.
During an MCA/INIT event, the cpu_rq(cpu)->curr task will be blocked
and the MCA/INIT "task" will be active. Fortunately that distinction
only affects the MCA/INIT handlers and debuggers like kdb and lcrash;
I can live with that.
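
To make the changed "blocked" test concrete, here is a rough user-space
sketch, not kernel code. Everything in it is a hypothetical stand-in
invented for illustration: the reduced struct task and struct runqueue,
the is_mca_init and interrupted fields, the per-cpu current_task[]
array, and the helpers mca_init_enter(), mca_init_exit(),
original_current(), active_task() and task_is_blocked(). It only models
the bookkeeping described above: current moves to the MCA/INIT task,
cpu_rq(cpu)->curr is left alone, and debuggers treat the rq->curr task
as blocked for the duration of the event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 2                 /* hypothetical, kept small for the sketch */

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct task {
	const char *comm;
	bool is_mca_init;         /* preformatted MCA/INIT "task" */
	struct task *interrupted; /* original current, valid during the event */
};

struct runqueue {
	struct task *curr;        /* what the scheduler believes is running */
};

struct runqueue runqueues[NR_CPUS];
struct task *current_task[NR_CPUS];   /* models per-cpu "current" */

struct runqueue *cpu_rq(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return &runqueues[cpu];
}

/* Event entry: switch current, deliberately leave rq->curr untouched. */
void mca_init_enter(int cpu, struct task *handler)
{
	handler->interrupted = current_task[cpu];
	current_task[cpu] = handler;
}

/* Event exit: restore the interrupted task as current. */
void mca_init_exit(int cpu)
{
	struct task *handler = current_task[cpu];

	current_task[cpu] = handler->interrupted;
	handler->interrupted = NULL;
}

/* For the few users (e.g. backtrace) that want the original current. */
struct task *original_current(int cpu)
{
	struct task *t = current_task[cpu];

	return (t && t->is_mca_init) ? t->interrupted : t;
}

/* The task really in control of this cpu, as a debugger should see it. */
struct task *active_task(int cpu)
{
	struct task *t = current_task[cpu];

	if (t && t->is_mca_init)
		return t;             /* MCA/INIT "task" is active */
	return cpu_rq(cpu)->curr;     /* normal case: trust the scheduler */
}

/* A task is blocked if it is not the active task on any cpu. */
bool task_is_blocked(const struct task *t)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (active_task(cpu) == t)
			return false;
	return true;
}
```

In this model the scheduler's view (rq->curr) never changes during the
event, so only callers of task_is_blocked() and original_current(),
that is the MCA/INIT handlers and debuggers, see the difference.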