* [Linux-ia64] sigaltstack and RBS
@ 2003-02-09 5:19 Matt Chapman
2003-02-09 5:27 ` David Mosberger
` (9 more replies)
0 siblings, 10 replies; 11+ messages in thread
From: Matt Chapman @ 2003-02-09 5:19 UTC (permalink / raw)
To: linux-ia64
I'm having some difficulty "demand paging" register backing store from
userspace (i.e. using SIGSEGV to map pages in on demand).
The problem is that even when using sigaltstack, the original backing
store (which caused the fault) is still touched when returning to the
signal trampoline, before it switches to the alternate RBS. Thus I
get recursive faulting before it gets to the signal handler.
Ideally, signal handling on an alternate RBS/stack wouldn't touch the
original RBS/stack at all.
Any suggestions how to deal with this?
Matt
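For readers unfamiliar with the technique, the following is a minimal sketch of SIGSEGV-driven demand paging for an ordinary data region (not the register backing store discussed in this thread). The names demand_page_demo, on_segv and the 64 KiB alternate stack size are assumptions made for illustration; they do not come from the original post. Calling mprotect() from a signal handler works on Linux in practice but is not on the POSIX list of async-signal-safe functions.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char *region;                    /* hypothetical demand-paged region */
static size_t region_size;
static long page_size;
static volatile sig_atomic_t faults;    /* number of pages mapped on demand */

/* SIGSEGV handler: make the faulting page accessible, then return so
 * that the faulting instruction is restarted. */
static void on_segv(int sig, siginfo_t *info, void *uc)
{
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)region;
    void *page;

    (void)sig;
    (void)uc;
    if (addr < base || addr >= base + region_size)
        _exit(2);                       /* not our region: a genuine fault */
    page = (void *)(addr & ~((uintptr_t)page_size - 1));
    if (mprotect(page, (size_t)page_size, PROT_READ | PROT_WRITE) == -1)
        _exit(3);
    faults++;
}

/* Touch npages inaccessible pages; return how many faults were handled,
 * or -1 if setup failed. */
int demand_page_demo(size_t npages)
{
    static char altstack[64 * 1024];    /* assumed size for the sketch */
    struct sigaction sa, old_sa;
    volatile char *p;
    stack_t ss;
    size_t i;

    page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    region_size = npages * (size_t)page_size;
    faults = 0;

    ss.ss_sp = altstack;
    ss.ss_flags = 0;
    ss.ss_size = sizeof altstack;
    if (sigaltstack(&ss, NULL) == -1)
        return -1;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_ONSTACK | SA_SIGINFO;
    sa.sa_sigaction = on_segv;
    if (sigaction(SIGSEGV, &sa, &old_sa) == -1)
        return -1;

    region = mmap(NULL, region_size, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;

    p = region;
    for (i = 0; i < npages; i++)
        p[i * (size_t)page_size] = (char)i;  /* first touch of each page faults */

    munmap(region, region_size);
    sigaction(SIGSEGV, &old_sa, NULL);
    return (int)faults;
}
```

Each page is touched once, so the return value should equal npages. The handler runs on the alternate stack because of SA_ONSTACK, which is the same arrangement the thread is trying to get working for the ia64 register backing store.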
* Re: [Linux-ia64] sigaltstack and RBS
2003-02-09 5:19 [Linux-ia64] sigaltstack and RBS Matt Chapman
@ 2003-02-09 5:27 ` David Mosberger
2003-02-09 5:47 ` Matt Chapman
` (8 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: David Mosberger @ 2003-02-09 5:27 UTC (permalink / raw)
To: linux-ia64
>>>>> On Sun, 9 Feb 2003 16:19:20 +1100, Matt Chapman <matthewc@cse.unsw.edu.au> said:
Matt> I'm having some difficulty "demand paging" register backing
Matt> store from userspace (i.e. using SIGSEGV to map pages in on
Matt> demand).
Matt> The problem is that even when using sigaltstack, the original
Matt> backing store (which caused the fault) is still touched when
Matt> returning to the signal trampoline, before it switches to the
Matt> alternate RBS. Thus I get recursive faulting before it gets
Matt> to the signal handler.
Matt> Ideally, signal handling on an alternate RBS/stack wouldn't
Matt> touch the original RBS/stack at all.
Matt> Any suggestions how to deal with this?
It sounds like you're using an old kernel. I don't recall exactly
when this was fixed, but recent kernels will put the dirty partition
on the _new_ stack, not the old one.
You can check gate.S:ia64_sigtramp(). If it branches to "setup_rbs"
_before_ the first "alloc" instruction, you should be fine.
--david
* Re: [Linux-ia64] sigaltstack and RBS
2003-02-09 5:19 [Linux-ia64] sigaltstack and RBS Matt Chapman
2003-02-09 5:27 ` David Mosberger
@ 2003-02-09 5:47 ` Matt Chapman
2003-02-09 7:58 ` Matt Chapman
` (7 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Matt Chapman @ 2003-02-09 5:47 UTC (permalink / raw)
To: linux-ia64
On Sat, Feb 08, 2003 at 09:27:54PM -0800, David Mosberger wrote:
>
> It sounds like you're using an old kernel.
Sorry, I should have given more information. I'm using
2.5.59-ia64-030124.
> I don't recall exactly
> when this was fixed, but recent kernels will put the dirty partition
> on the _new_ stack, not the old one.
>
> You can check gate.S:ia64_sigtramp(). If it branches to "setup_rbs"
> _before_ the first "alloc" instruction, you should be fine.
Yes, it does.
Hmm, the faults I'm getting are at ia64_sigtramp (exactly). The first
instruction (add r2=40,r12) doesn't look like it could possibly be the
cause, so I presume it is the RFI that is forcing a frame to be loaded
(?).
Matt
* Re: [Linux-ia64] sigaltstack and RBS
2003-02-09 5:19 [Linux-ia64] sigaltstack and RBS Matt Chapman
2003-02-09 5:27 ` David Mosberger
2003-02-09 5:47 ` Matt Chapman
@ 2003-02-09 7:58 ` Matt Chapman
2003-02-09 8:48 ` David Mosberger
` (6 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Matt Chapman @ 2003-02-09 7:58 UTC (permalink / raw)
To: linux-ia64
[-- Attachment #1: Type: text/plain, Size: 36 bytes --]
Here's a small test program.
Matt
[-- Attachment #2: test.c --]
[-- Type: text/plain, Size: 971 bytes --]
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/mman.h>
#include <asm/page.h>

#define RBS_BASE (void *)0x60000fff80000000UL
#define RBS_SIZE PAGE_SIZE

static void mprotect_rbs(int flags)
{
	mprotect(RBS_BASE, RBS_SIZE, flags);
}

static void SIGSEGV_handler(int sig, siginfo_t *info, void *uc)
{
#if 1
	abort();
#endif
	mprotect_rbs(PROT_READ|PROT_WRITE);
}

int main(void)
{
	struct sigaction sa;
	stack_t newstack;
	static unsigned long stack[SIGSTKSZ/sizeof(unsigned long)]
		__attribute__((aligned(PAGE_SIZE))) /* paranoia */;

	newstack.ss_sp = stack;
	newstack.ss_flags = 0;
	newstack.ss_size = sizeof(stack);
	if (sigaltstack(&newstack, NULL) == -1)
	{
		perror("sigaltstack");
		return 1;
	}

	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_ONSTACK | SA_SIGINFO;
	sa.sa_sigaction = SIGSEGV_handler;
	if (sigaction(SIGSEGV, &sa, NULL) == -1)
	{
		perror("sigaction");
		return 1;
	}

	mprotect_rbs(0);
	printf("Still alive!\n");
	return 0;
}
* Re: [Linux-ia64] sigaltstack and RBS
2003-02-09 5:19 [Linux-ia64] sigaltstack and RBS Matt Chapman
` (2 preceding siblings ...)
2003-02-09 7:58 ` Matt Chapman
@ 2003-02-09 8:48 ` David Mosberger
2003-02-09 10:55 ` Matt Chapman
` (5 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: David Mosberger @ 2003-02-09 8:48 UTC (permalink / raw)
To: linux-ia64
>>>>> On Sun, 9 Feb 2003 18:58:06 +1100, Matt Chapman <matthewc@cse.unsw.edu.au> said:
Matt> Here's a small test program.
Hmmh, the test program doesn't test backing-store _overflow_, it tests
what happens when you _remove_ a formerly valid mapping. The program
fails because the "rfi" that gets executed when returning from
mprotect() may end up trying to restore registers that got spilled to
the backing store before the call to mprotect(). The mprotect() then
removes access permission and hence the "rfi" can never finish
execution (effectively, the mprotect() makes the contents of the
spilled stacked registers disappear for good).
The current sigaltstack implementation isn't designed to handle such a
case. And I'm not sure whether it should. Is there a particular
reason you want to do this sort of thing?
--david
* Re: [Linux-ia64] sigaltstack and RBS
2003-02-09 5:19 [Linux-ia64] sigaltstack and RBS Matt Chapman
` (3 preceding siblings ...)
2003-02-09 8:48 ` David Mosberger
@ 2003-02-09 10:55 ` Matt Chapman
2003-02-09 18:22 ` David Mosberger
` (4 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Matt Chapman @ 2003-02-09 10:55 UTC (permalink / raw)
To: linux-ia64
On Sun, Feb 09, 2003 at 12:48:40AM -0800, David Mosberger wrote:
>
> The current sigaltstack implementation isn't designed to handle such a
> case. And I'm not sure whether it should. Is there a particular
> reason you want to do this sort of thing?
I'll explain the context. I've written a virtual machine monitor
which currently (for ease of prototyping) runs completely in
userspace. e.g. itc does an mmap, ptc does an munmap, changing RID
unmaps a whole region, SIGSEGV delivers a TLB miss to the "guest"
kernel.
Now after a flush or RID change the guest kernel returns to its
userspace with ar.bspstore pointing off to somewhere that isn't
mapped, expecting to get a fault eventually. This is where the
problem occurs. A mandatory RSE load faults as expected and the
kernel tries to deliver SIGSEGV. But then the RFI to the signal
trampoline repeats the same RSE load that caused the fault in the
first place, before the signal handler can deal with it.
Is there any reason that the signal trampoline needs to see the
original frame, or would it suffice to give it an empty frame?
(Hmm, presumably this would mean filling out sc_cfm in the kernel...
how to do that if we're in a syscall and haven't done the cover?)
Matt
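As an illustration of the userspace approach described in the message above (a TLB insert backed by mmap, a TLB purge backed by unmapping), here is a rough sketch. It is not Matt's monitor: arena_init, emulate_itc and emulate_ptc are hypothetical names, pages are addressed by index inside a reserved arena rather than by guest virtual address, and the purge remaps the page as PROT_NONE instead of calling munmap() so that the reserved range cannot be reused by an unrelated allocation.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static char *arena;         /* reserved, initially inaccessible "guest" range */
static size_t arena_pages;
static size_t pagesz;

/* Reserve npages of address space with no access permissions. */
int arena_init(size_t npages)
{
    long ps = sysconf(_SC_PAGESIZE);
    void *p;

    if (ps <= 0)
        return -1;
    pagesz = (size_t)ps;
    p = mmap(NULL, npages * pagesz, PROT_NONE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    arena = p;
    arena_pages = npages;
    return 0;
}

/* "itc": install a translation by mapping a fresh zero-filled page. */
void *emulate_itc(size_t page)
{
    void *p;

    assert(page < arena_pages);
    p = mmap(arena + page * pagesz, pagesz, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* "ptc": purge a translation; later accesses fault with SIGSEGV. */
int emulate_ptc(size_t page)
{
    void *p;

    assert(page < arena_pages);
    p = mmap(arena + page * pagesz, pagesz, PROT_NONE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return p == MAP_FAILED ? -1 : 0;
}
```

After emulate_ptc(), touching the page raises SIGSEGV, which is the event a monitor of this kind would forward to the guest as a TLB miss. Because each emulate_itc() maps a fresh anonymous page, contents written before a purge are gone after the next insert.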
* Re: [Linux-ia64] sigaltstack and RBS
2003-02-09 5:19 [Linux-ia64] sigaltstack and RBS Matt Chapman
` (4 preceding siblings ...)
2003-02-09 10:55 ` Matt Chapman
@ 2003-02-09 18:22 ` David Mosberger
2003-02-11 3:06 ` David Mosberger
` (3 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: David Mosberger @ 2003-02-09 18:22 UTC (permalink / raw)
To: linux-ia64
>>>>> On Sun, 9 Feb 2003 21:55:50 +1100, Matt Chapman <matthewc@cse.unsw.edu.au> said:
Matt> I'll explain the context. I've written a virtual machine
Matt> monitor which currently (for ease of prototyping) runs
Matt> completely in userspace. e.g. itc does an mmap, ptc does an
Matt> munmap, changing RID unmaps a whole region, SIGSEGV delivers a
Matt> TLB miss to the "guest" kernel.
Matt> Now after a flush or RID change the guest kernel returns to
Matt> its userspace with ar.bspstore pointing off to somewhere that
Matt> isn't mapped, expecting to get a fault eventually. This is
Matt> where the problem occurs. A mandatory RSE load faults as
Matt> expected and the kernel tries to deliver SIGSEGV. But then
Matt> the RFI to the signal trampoline repeats the same RSE load
Matt> that caused the fault in the first place, before the signal
Matt> handler can deal with it.
Ah, that does sound interesting. Actually, my explanation from
yesterday can't be right: the current register frame as of the time of
the mandatory RSE fault by definition is NOT on the user backing
store, so the "rfi" shouldn't trigger any mandatory loads (the user's
current frame is restored by the "loadrs" in the kernel's exit path).
Matt> Is there any reason that the signal trampoline needs to see
Matt> the original frame, or would it suffice to give it an empty
Matt> frame? (Hmm, presumably this would mean filling out sc_cfm in
Matt> the kernel... how to do that if we're in a syscall and
Matt> haven't done the cover?)
Someone needs to take care of backing the user's dirty partition.
Since the user's original stack may not be valid, it needs to be
backed by the alternate signal stack. That's most easily done by the
signal trampoline after switching to the new backing store.
I'm not sure off-hand what's wrong. I'll take a look Monday. For
what it's worth, the test program does seem to work fine on 2.5.52.
--david
* Re: [Linux-ia64] sigaltstack and RBS
2003-02-09 5:19 [Linux-ia64] sigaltstack and RBS Matt Chapman
` (5 preceding siblings ...)
2003-02-09 18:22 ` David Mosberger
@ 2003-02-11 3:06 ` David Mosberger
2003-02-11 7:26 ` Matt Chapman
` (2 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: David Mosberger @ 2003-02-11 3:06 UTC (permalink / raw)
To: linux-ia64
>>>>> On Sun, 9 Feb 2003 16:19:20 +1100, Matt Chapman <matthewc@cse.unsw.edu.au> said:
Matt> I'm having some difficulty "demand paging" register backing
Matt> store from userspace (i.e. using SIGSEGV to map pages in on
Matt> demand).
Matt> The problem is that even when using sigaltstack, the original
Matt> backing store (which caused the fault) is still touched when
Matt> returning to the signal trampoline, before it switches to the
Matt> alternate RBS. Thus I get recursive faulting before it gets
Matt> to the signal handler.
Matt> Ideally, signal handling on an alternate RBS/stack wouldn't
Matt> touch the original RBS/stack at all.
Matt> Any suggestions how to deal with this?
OK, I looked into this. The patch below should fix the problem.
Bjorn, you might want to consider applying the same patch for 2.4.xx
(with some extra testing).
The problem was that the signal delivery code wasn't prepared to
handle incomplete register frames. Such frames are the result of
mandatory RSE loads which fault. This situation is a corner-case
because when a mandatory RSE load faults, the triggering instruction
has been executed already (except for the restoration of the current
frame). The patch below fixes the problem by making sure that we
don't try to restore any registers from the user-level backing store
when invoking the signal handler. The dirty partition of course still
gets preserved, as it should be.
--david
diff -Nru a/arch/ia64/kernel/entry.S b/arch/ia64/kernel/entry.S
--- a/arch/ia64/kernel/entry.S Mon Feb 10 18:51:22 2003
+++ b/arch/ia64/kernel/entry.S Mon Feb 10 18:51:22 2003
@@ -904,13 +904,14 @@
 	mov r9=ar.unat
 	mov loc0=rp // save return address
 	mov out0=0 // there is no "oldset"
-	adds out1=0,sp // out1=&sigscratch
+	adds out1=8,sp // out1=&sigscratch->ar_pfs
 (pSys) mov out2=1 // out2=1 => we're in a syscall
 	;;
 (pNonSys) mov out2=0 // out2=0 => not a syscall
 	.fframe 16
 	.spillpsp ar.unat, 16 // (note that offset is relative to psp+0x10!)
 	st8 [sp]=r9,-16 // allocate space for ar.unat and save it
+	st8 [out1]=loc1,-8 // save ar.pfs, out1=&sigscratch
 	.body
 	br.call.sptk.many rp=do_notify_resume_user
 .ret15: .restore sp
@@ -931,11 +932,12 @@
 	mov loc0=rp // save return address
 	mov out0=in0 // mask
 	mov out1=in1 // sigsetsize
-	adds out2=0,sp // out2=&sigscratch
+	adds out2=8,sp // out2=&sigscratch->ar_pfs
 	;;
 	.fframe 16
 	.spillpsp ar.unat, 16 // (note that offset is relative to psp+0x10!)
 	st8 [sp]=r9,-16 // allocate space for ar.unat and save it
+	st8 [out2]=loc1,-8 // save ar.pfs, out2=&sigscratch
 	.body
 	br.call.sptk.many rp=ia64_rt_sigsuspend
 .ret17: .restore sp
diff -Nru a/arch/ia64/kernel/gate.S b/arch/ia64/kernel/gate.S
--- a/arch/ia64/kernel/gate.S Mon Feb 10 18:51:22 2003
+++ b/arch/ia64/kernel/gate.S Mon Feb 10 18:51:22 2003
@@ -145,11 +145,12 @@
 */
 #define SIGTRAMP_SAVES \
-	.unwabi @svr4, 's' // mark this as a sigtramp handler (saves scratch regs) \
-	.savesp ar.unat, UNAT_OFF+SIGCONTEXT_OFF \
-	.savesp ar.fpsr, FPSR_OFF+SIGCONTEXT_OFF \
-	.savesp pr, PR_OFF+SIGCONTEXT_OFF \
-	.savesp rp, RP_OFF+SIGCONTEXT_OFF \
+	.unwabi @svr4, 's'; /* mark this as a sigtramp handler (saves scratch regs) */ \
+	.savesp ar.unat, UNAT_OFF+SIGCONTEXT_OFF; \
+	.savesp ar.fpsr, FPSR_OFF+SIGCONTEXT_OFF; \
+	.savesp pr, PR_OFF+SIGCONTEXT_OFF; \
+	.savesp rp, RP_OFF+SIGCONTEXT_OFF; \
+	.savesp ar.pfs, CFM_OFF+SIGCONTEXT_OFF; \
 	.vframesp SP_OFF+SIGCONTEXT_OFF
 GLOBAL_ENTRY(ia64_sigtramp)
@@ -173,9 +174,7 @@
 	.spillsp.p p8, ar.rnat, RNAT_OFF+SIGCONTEXT_OFF
 (p8) br.cond.spnt setup_rbs // yup -> (clobbers r14, r15, and r16)
 back_from_setup_rbs:
-
-	.spillreg ar.pfs, r8
-	alloc r8=ar.pfs,0,0,3,0 // get CFM0, EC0, and CPL0 into r8
+	alloc r8=ar.pfs,0,0,3,0
 	ld8 out0=[base0],16 // load arg0 (signum)
 	adds base1=(ARG1_OFF-(RBS_BASE_OFF+SIGCONTEXT_OFF)),base1
 	;;
@@ -184,17 +183,12 @@
 	;;
 	ld8 out2=[base0] // load arg2 (sigcontextp)
 	ld8 gp=[r17] // get signal handler's global pointer
-
 	adds base0=(BSP_OFF+SIGCONTEXT_OFF),sp
 	;;
 	.spillsp ar.bsp, BSP_OFF+SIGCONTEXT_OFF
-	st8 [base0]=r9,(CFM_OFF-BSP_OFF) // save sc_ar_bsp
-	dep r8=0,r8,38,26 // clear EC0, CPL0 and reserved bits
-	adds base1=(FR6_OFF+16+SIGCONTEXT_OFF),sp
-	;;
-	.spillsp ar.pfs, CFM_OFF+SIGCONTEXT_OFF
-	st8 [base0]=r8 // save CFM0
+	st8 [base0]=r9 // save sc_ar_bsp
 	adds base0=(FR6_OFF+SIGCONTEXT_OFF),sp
+	adds base1=(FR6_OFF+16+SIGCONTEXT_OFF),sp
 	;;
 	stf.spill [base0]=f6,32
 	stf.spill [base1]=f7,32
@@ -217,7 +211,6 @@
 	ld8 r15=[base0],(CFM_OFF-BSP_OFF) // fetch sc_ar_bsp and advance to CFM_OFF
 	mov r14=ar.bsp
 	;;
-	ld8 r8=[base0] // restore (perhaps modified) CFM0, EC0, and CPL0
 	cmp.ne p8,p0=r14,r15 // do we need to restore the rbs?
 (p8) br.cond.spnt restore_rbs // yup -> (clobbers r14-r18, f6 & f7)
 	;;
diff -Nru a/arch/ia64/kernel/sigframe.h b/arch/ia64/kernel/sigframe.h
--- a/arch/ia64/kernel/sigframe.h Mon Feb 10 18:51:22 2003
+++ b/arch/ia64/kernel/sigframe.h Mon Feb 10 18:51:22 2003
@@ -1,6 +1,6 @@
 struct sigscratch {
 	unsigned long scratch_unat; /* ar.unat for the general registers saved in pt */
-	unsigned long pad;
+	unsigned long ar_pfs; /* for syscalls, the user-level function-state */
 	struct pt_regs pt;
 };
diff -Nru a/arch/ia64/kernel/signal.c b/arch/ia64/kernel/signal.c
--- a/arch/ia64/kernel/signal.c Mon Feb 10 18:51:22 2003
+++ b/arch/ia64/kernel/signal.c Mon Feb 10 18:51:22 2003
@@ -315,7 +315,7 @@
 static long
 setup_sigcontext (struct sigcontext *sc, sigset_t *mask, struct sigscratch *scr)
 {
-	unsigned long flags = 0, ifs, nat;
+	unsigned long flags = 0, ifs, cfm, nat;
 	long err;
 	ifs = scr->pt.cr_ifs;
@@ -325,7 +325,9 @@
 	if ((ifs & (1UL << 63)) == 0) {
 		/* if cr_ifs isn't valid, we got here through a syscall */
 		flags |= IA64_SC_FLAG_IN_SYSCALL;
-	}
+		cfm = scr->ar_pfs & ((1UL << 38) - 1);
+	} else
+		cfm = ifs & ((1UL << 38) - 1);
 	ia64_flush_fph(current);
 	if ((current->thread.flags & IA64_THREAD_FPH_VALID)) {
 		flags |= IA64_SC_FLAG_FPH_VALID;
@@ -344,6 +346,7 @@
 	err |= __put_user(nat, &sc->sc_nat);
 	err |= PUT_SIGSET(mask, &sc->sc_mask);
+	err |= __put_user(cfm, &sc->sc_cfm);
 	err |= __put_user(scr->pt.cr_ipsr & IA64_PSR_UM, &sc->sc_um);
 	err |= __put_user(scr->pt.ar_rsc, &sc->sc_ar_rsc);
 	err |= __put_user(scr->pt.ar_ccv, &sc->sc_ar_ccv);
@@ -422,6 +425,15 @@
 	scr->pt.ar_fpsr = FPSR_DEFAULT; /* reset fpsr for signal handler */
 	scr->pt.cr_iip = tramp_addr;
 	ia64_psr(&scr->pt)->ri = 0; /* start executing in first slot */
+	/*
+	 * Force the interruption function mask to zero. This has no effect when a
+	 * system-call got interrupted by a signal (since, in that case, scr->pt_cr_ifs is
+	 * ignored), but it has the desirable effect of making it possible to deliver a
+	 * signal with an incomplete register frame (which happens when a mandatory RSE
+	 * load faults). Furthermore, it has no negative effect on the getting the user's
+	 * dirty partition preserved, because that's governed by scr->pt.loadrs.
+	 */
+	scr->pt.cr_ifs = (1UL << 63);
 	/*
 	 * Note: this affects only the NaT bits of the scratch regs (the ones saved in
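The frame-marker selection that the patch adds to setup_sigcontext() can be tried out in isolation. The sketch below mirrors only that branch: it tests bit 63 of cr.ifs as the patch does and masks the chosen value with (1 << 38) - 1. The function name pick_cfm, the macro names, and the sample register values are made up for illustration and are not part of the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IFS_VALID (UINT64_C(1) << 63)        /* bit tested by the patch */
#define CFM_MASK  ((UINT64_C(1) << 38) - 1)  /* mask applied by the patch */

/* Hypothetical stand-alone version of the cfm selection: if cr.ifs is not
 * valid we arrived via a syscall and the frame marker comes from the saved
 * ar.pfs; otherwise it comes from cr.ifs itself. */
uint64_t pick_cfm(uint64_t cr_ifs, uint64_t ar_pfs, int *in_syscall)
{
    assert(in_syscall != NULL);
    if ((cr_ifs & IFS_VALID) == 0) {
        *in_syscall = 1;
        return ar_pfs & CFM_MASK;
    }
    *in_syscall = 0;
    return cr_ifs & CFM_MASK;
}
```

For example, pick_cfm(0, 0xC000000000000285, &f) yields 0x285 with f set to 1, while pick_cfm(0x8000000000000183, 0, &f) yields 0x183 with f set to 0. In both cases the upper bits are discarded before the value would be stored in sc_cfm.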
* Re: [Linux-ia64] sigaltstack and RBS
2003-02-09 5:19 [Linux-ia64] sigaltstack and RBS Matt Chapman
` (6 preceding siblings ...)
2003-02-11 3:06 ` David Mosberger
@ 2003-02-11 7:26 ` Matt Chapman
2003-02-11 19:11 ` David Mosberger
2003-03-07 23:35 ` Bjorn Helgaas
9 siblings, 0 replies; 11+ messages in thread
From: Matt Chapman @ 2003-02-11 7:26 UTC (permalink / raw)
To: linux-ia64
On Mon, Feb 10, 2003 at 07:06:54PM -0800, David Mosberger wrote:
>
> OK, I looked into this. The patch below should fix the problem.
Thanks! That does the trick.
Matt
* Re: [Linux-ia64] sigaltstack and RBS
2003-02-09 5:19 [Linux-ia64] sigaltstack and RBS Matt Chapman
` (7 preceding siblings ...)
2003-02-11 7:26 ` Matt Chapman
@ 2003-02-11 19:11 ` David Mosberger
2003-03-07 23:35 ` Bjorn Helgaas
9 siblings, 0 replies; 11+ messages in thread
From: David Mosberger @ 2003-02-11 19:11 UTC (permalink / raw)
To: linux-ia64
>>>>> On Tue, 11 Feb 2003 18:26:57 +1100, Matt Chapman <matthewc@cse.unsw.edu.au> said:
Matt> On Mon, Feb 10, 2003 at 07:06:54PM -0800, David Mosberger
Matt> wrote:
>> OK, I looked into this. The patch below should fix the problem.
Matt> Thanks! That does the trick.
Well, thanks for the nice test case. That always helps a great deal.
--david
* Re: [Linux-ia64] sigaltstack and RBS
2003-02-09 5:19 [Linux-ia64] sigaltstack and RBS Matt Chapman
` (8 preceding siblings ...)
2003-02-11 19:11 ` David Mosberger
@ 2003-03-07 23:35 ` Bjorn Helgaas
9 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2003-03-07 23:35 UTC (permalink / raw)
To: linux-ia64
On Monday 10 February 2003 8:06 pm, David Mosberger wrote:
> >>>>> On Sun, 9 Feb 2003 16:19:20 +1100, Matt Chapman <matthewc@cse.unsw.edu.au> said:
>
> Matt> I'm having some difficulty "demand paging" register backing
> Matt> store from userspace (i.e. using SIGSEGV to map pages in on
> Matt> demand).
>
> OK, I looked into this. The patch below should fix the problem.
> Bjorn, you might want to consider applying the same patch for 2.4.xx
> (with some extra testing).
I applied this patch to 2.4.
Bjorn