* Sigcontext->sc_pc Passed to User
@ 2002-07-11 9:08 ` Kevin D. Kissell
0 siblings, 0 replies; 18+ messages in thread
From: Kevin D. Kissell @ 2002-07-11 9:08 UTC
To: linux-mips
In responding to an enquiry from one of MIPS' third-party
software vendors, I noted something that seems a little
broken to me in the current (and maybe all historical)
MIPS/Linux kernels. Please forgive me for opening
old wounds if this has been beaten to death in the past.
When a user catches a signal, such as SIGBUS, the
signal "payload" includes a pointer to a sigcontext
structure on the stack, containing the state of the
CPU when the exception associated with the signal
occurred. But not exactly. We seem to consistently
call compute_return_epc() before send_sig() or
force_sig(). This results in the user being passed
an indication of the faulting PC that is one instruction
past the true location. That would be no problem,
except that the faulting instruction may have been
in a branch delay slot, such that there is no practical
and reliable way for the signal handler to determine
which instruction failed on the basis of the sigcontext
data.
It is, of course, important that execution resume
at the instruction following any instruction generating
an exception/signal. But that's not the same thing
as saying that the sigcontext should report the resumption
EPC instead of the faulting EPC. There are various
ways of dealing with this, but before going into any
of them, I'm curious as to whether this has been
discussed before, and whether anyone thinks that
things really should be the way they are.
Regards,
Kevin K.
* Re: Sigcontext->sc_pc Passed to User
@ 2002-07-11 13:17 ` Maciej W. Rozycki
From: Maciej W. Rozycki @ 2002-07-11 13:17 UTC
To: Kevin D. Kissell; +Cc: linux-mips

On Thu, 11 Jul 2002, Kevin D. Kissell wrote:

> In responding to an enquiry from one of MIPS' third-party
> software vendors, I noted something that seems a little
> broken to me in the current (and maybe all historical)
> MIPS/Linux kernels. Please forgive me for opening
> old wounds if this has been beaten to death in the past.

:-/

> When a user catches a signal, such as SIGBUS, the
> signal "payload" includes a pointer to a sigcontext
> structure on the stack, containing the state of the
> CPU when the exception associated with the signal
> occurred. But not exactly. We seem to consistently
> call compute_return_epc() before send_sig() or
> force_sig(). This results in the user being passed
> an indication of the faulting PC that is one instruction
> past the true location. That would be no problem,
> except that the faulting instruction may have been
> in a branch delay slot, such that there is no practical
> and reliable way for the signal handler to determine
> which instruction failed on the basis of the sigcontext
> data.

That needs to be done globally, once and forever for all kinds of signals
passed to a program. I have partial fixes that I am using privately
already, but a complete solution is on my to-do list.

> It is, of course, important that execution resume
> at the instruction following any instruction generating
> an exception/signal. But that's not the same thing
> as saying that the sigcontext should report the resumption
> EPC instead of the faulting EPC. There are various
> ways of dealing with this, but before going into any
> of them, I'm curious as to whether this has been
> discussed before, and whether anyone thinks that
> things really should be the way they are.

I believe the resumption should happen with EPC unmodified. A handler
may set EPC differently if it wants (possibly with longjmp() or by
interpreting code at EPC and modifying EPC appropriately). For the three
signal handling possibilities, I'd do that as follows (assuming SIGBUS,
SIGSEGV, etc. lethal signals):

- SIG_IGN: return to EPC with no action. A program will loop
  indefinitely, but if that's what a user wants...

- SIG_DFL: kill.

- HANDLER: call a handler with the signal context unmodified and let the
  user code decide what to do.

  Maciej

--
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +
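
The HANDLER case above, where the handler itself decides where execution continues "possibly with longjmp()", can be sketched in portable POSIX C. This is only an illustration under stated assumptions, not code from the thread and not MIPS-specific: the names read_faults(), map_no_access_page() and last_fault_addr are hypothetical, and the handler looks at siginfo_t.si_addr (the faulting data address) rather than at sc_pc.

```c
#define _GNU_SOURCE            /* MAP_ANONYMOUS, sigsetjmp under strict -std modes */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover_point;   /* where the handler sends execution */
void *volatile last_fault_addr;    /* hypothetical: last si_addr seen */

/* Record the faulting address, then abandon the faulting instruction
 * instead of returning to it (returning would simply re-fault). */
static void on_fault(int sig, siginfo_t *info, void *ucontext)
{
    (void)sig;
    (void)ucontext;
    last_fault_addr = info->si_addr;
    siglongjmp(recover_point, 1);
}

/* Returns 1 if reading one byte at addr raised SIGSEGV/SIGBUS, else 0. */
int read_faults(const volatile char *addr)
{
    struct sigaction sa, old_segv, old_bus;
    int faulted, rc;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGSEGV, &sa, &old_segv);
    assert(rc == 0);
    rc = sigaction(SIGBUS, &sa, &old_bus);
    assert(rc == 0);
    (void)rc;

    /* savemask = 1 so the signal mask is restored after siglongjmp() */
    if (sigsetjmp(recover_point, 1) == 0) {
        char c = *addr;            /* may fault */
        (void)c;
        faulted = 0;
    } else {
        faulted = 1;               /* arrived here from on_fault() */
    }

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return faulted;
}

/* Maps one page with no access rights; returns NULL on failure. */
void *map_no_access_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, (size_t)page, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Calling read_faults() on a page obtained from map_no_access_page() should report a fault, while calling it on an ordinary variable should not. Note that this recovery path never needs to know which instruction faulted, which is exactly why it sidesteps the sc_pc question rather than answering it.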
* Re: Sigcontext->sc_pc Passed to User
@ 2002-07-11 15:16 ` Kevin D. Kissell
From: Kevin D. Kissell @ 2002-07-11 15:16 UTC
To: Maciej W. Rozycki; +Cc: linux-mips

From: "Maciej W. Rozycki" <macro@ds2.pg.gda.pl>
[snip]
> I believe the resumption should happen with EPC unmodified. A handler
> may set EPC differently if it wants (possibly with longjmp() or by
> interpreting code at EPC and modifying EPC appropriately). For the three
> signal handling possibilities, I'd do that as follows (assuming SIGBUS,
> SIGSEGV, etc. lethal signals):
>
> - SIG_IGN: return to EPC with no action. A program will loop
> indefinitely, but if that's what a user wants...

I don't think that this is the right thing to do, philosophically.
Hanging in an infinite loop and making no forward progress
is not, to me, "ignoring" an event. The old X/Open specs I've
got say that SIGFPE, SIGILL, and SIGSEGV behavior is
undefined if bound to SIG_IGN (curiously, they don't call
out SIGBUS), but I think that in practical terms we need to
provide whatever behavior people expect from Linux on
x86 and PPC. What happens on those platforms? A
quick look at the x86 kernel code makes me think that
they do, indeed, do the "wrong" thing and beat their
heads against the ignored event for all eternity, but I'm
insufficiently an expert in x86 trap semantics to know
for certain whether that's the case. If it is, right or
wrong, that's what we ought to do.

> - SIG_DFL: kill.
>
> - HANDLER: call a handler with the signal context unmodified and let the
> user code decide what to do.

Independently of what we do for the SIG_IGN cases,
this is important, and the user code cannot decide what
to do if it cannot know what instruction caused the fault.
Fixups on SIGFPE must be able to find the FP instruction,
which is not currently possible if it was in a branch delay
slot. Similarly, user-mode emulation of "memory" via
signal handlers cannot work unless the loads and stores
can be identified. But, having "done the deed", return
from the signal handler should resume at the instruction
*following* the one generating the fault, and not replay
the same instruction. We *could* punt that to the signal
handler, but making every signal package carry its own
copy of compute_return_epc() to handle the branch
delay slot cases strikes me as being unfriendly to the
user and is arguably slightly less reliable. I guess I'd like things
to be rigged so that the sigcontext structure contains the address
of the faulting instruction as the sc_pc, but where the return
from signal goes to the address calculated by
compute_return_epc(). But again, what do people expect
in the "mainstream" world of x86 Linux?

Regards,

Kevin K.
* Re: Sigcontext->sc_pc Passed to User
@ 2002-07-11 16:52 ` Maciej W. Rozycki
From: Maciej W. Rozycki @ 2002-07-11 16:52 UTC
To: Kevin D. Kissell; +Cc: linux-mips

On Thu, 11 Jul 2002, Kevin D. Kissell wrote:

> > - SIG_IGN: return to EPC with no action. A program will loop
> > indefinitely, but if that's what a user wants...
>
> I don't think that this is the right thing to do, philosophically.
> Hanging in an infinite loop and making no forward progress
> is not, to me, "ignoring" an event. The old X/Open specs I've
> got say that SIGFPE, SIGILL, and SIGSEGV behavior is
> undefined if bound to SIG_IGN (curiously, they don't call
> out SIGBUS), but I think that in practical terms we need to
> provide whatever behavior people expect from Linux on
> x86 and PPC. What happens on those platforms? A
> quick look at the x86 kernel code makes me think that
> they do, indeed, do the "wrong" thing and beat their
> heads against the ignored event for all eternity, but I'm
> insufficiently an expert in x86 trap semantics to know
> for certain whether that's the case. If it is, right or
> wrong, that's what we ought to do.

Yes, they loop indefinitely. That may be useful for debugging -- you may
attach to a running program and you'll be sure to get at the faulting
instruction. Otherwise the warning from the libc manual applies: "If you
block or ignore these signals or establish handlers for them that return
normally, your program will probably break horribly when such signals
happen, unless they are generated by `raise' or `kill' instead of a real
error." So a user (programmer) has been warned.

> > - HANDLER: call a handler with the signal context unmodified and let the
> > user code decide what to do.
>
> Independently of what we do for the SIG_IGN cases,
> this is important, and the user code cannot decide what
> to do if it cannot know what instruction caused the fault.
> Fixups on SIGFPE must be able to find the FP instruction,
> which is not currently possible if it was in a branch delay
> slot. Similarly, user-mode emulation of "memory" via

Well, the Cause register is passed to the userland, so only EPC needs to
be fixed.

> signal handlers cannot work unless the loads and stores
> can be identified. But, having "done the deed", return
> from the signal handler should resume at the instruction
> *following* the one generating the fault, and not replay
> the same instruction. We *could* punt that to the signal
> handler, but making every signal package carry its own
> copy of compute_return_epc() to handle the branch
> delay slot cases strikes me as being unfriendly to the
> user and is arguably slightly less reliable. I guess I'd like things
> to be rigged so that the sigcontext structure contains the address
> of the faulting instruction as the sc_pc, but where the return
> from signal goes to the address calculated by
> compute_return_epc(). But again, what do people expect
> in the "mainstream" world of x86 Linux?

;-) FPE faults on the x87 fault before the *following* FP instruction
(which is a regular one or the special "wait" one). The context of the
faulting instruction (both the instruction and data addresses and the
opcode) is saved in special registers (as usual with i386, the most
complex way was chosen) and can be retrieved by dumping the FPU context to
memory (see the "fnstenv" and "fnsave" instructions). So the i386 is very
different and can't really be used as a reference.

However, a brief look at the Alpha port (which is mature, and the Alpha
CPU is also quite similar to MIPS) reveals the code never modifies the
saved PC in the kernel. But again, the FPU traps happen after faulting
instructions (for older models even imprecisely -- see the search back
code in alpha_fp_emul_imprecise()).

With current specifications I think the best way for the SIGFPE handler
(since it's somewhat special) would be to provide the address of the
faulting instruction in siginfo_t.si_addr and have the EPC in sigcontext
set up for a continuation (that would still allow longjmp(), etc.).
Ideally, I'd see it reversed, i.e. EPC unchanged and siginfo_t.si_addr
containing an address to continue, so that a handler would have to
explicitly copy the address to EPC if it decided it handled the signal
successfully (so that a program doesn't continue unpredictably after an
integer division by zero, because the handler expected only real FP
faults) -- maybe we should extend siginfo_t?

For other exceptions, I'd just leave EPC alone.

--
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +
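
The remark that "the Cause register is passed to the userland, so only EPC needs to be fixed" rests on a simple rule: when an exception is taken in a branch delay slot, the hardware EPC points at the branch and Cause.BD (bit 31) is set, so the faulting instruction sits at EPC + 4. A minimal sketch of that arithmetic follows, assuming 32-bit addresses, fixed 4-byte instructions (no MIPS16), and a handler that is given the unmodified hardware EPC, which is precisely what the kernels under discussion do not provide. The name faulting_insn_addr() is hypothetical.

```c
#include <stdint.h>

/* Cause.BD: set when the exception was taken in a branch delay slot. */
#define CAUSE_BD (UINT32_C(1) << 31)

/*
 * Given the unmodified hardware EPC and the Cause register, return the
 * address of the instruction that actually faulted.
 *
 *   BD clear: EPC is the faulting instruction itself.
 *   BD set:   EPC is the branch; the faulting instruction is in its
 *             delay slot, one instruction (4 bytes) later.
 */
uint32_t faulting_insn_addr(uint32_t epc, uint32_t cause)
{
    return (cause & CAUSE_BD) ? epc + 4u : epc;
}
```

The reverse direction is the hard one: once the kernel has replaced EPC with the resumption address, a taken branch may have moved it arbitrarily far, and no such one-line correction can recover the faulting address.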
* Re: Sigcontext->sc_pc Passed to User
@ 2002-07-12 1:40 ` Ralf Baechle
From: Ralf Baechle @ 2002-07-12 1:40 UTC
To: Kevin D. Kissell; +Cc: linux-mips

On Thu, Jul 11, 2002 at 11:08:21AM +0200, Kevin D. Kissell wrote:

> In responding to an enquiry from one of MIPS' third-party
> software vendors, I noted something that seems a little
> broken to me in the current (and maybe all historical)
> MIPS/Linux kernels. Please forgive me for opening
> old wounds if this has been beaten to death in the past.
>
> When a user catches a signal, such as SIGBUS, the
> signal "payload" includes a pointer to a sigcontext
> structure on the stack, containing the state of the
> CPU when the exception associated with the signal
> occurred. But not exactly. We seem to consistently
> call compute_return_epc() before send_sig() or
> force_sig(). This results in the user being passed
> an indication of the faulting PC that is one instruction
> past the true location. That would be no problem,
> except that the faulting instruction may have been
> in a branch delay slot, such that there is no practical
> and reliable way for the signal handler to determine
> which instruction failed on the basis of the sigcontext
> data.
>
> It is, of course, important that execution resume
> at the instruction following any instruction generating
> an exception/signal. But that's not the same thing
> as saying that the sigcontext should report the resumption
> EPC instead of the faulting EPC. There are various
> ways of dealing with this, but before going into any
> of them, I'm curious as to whether this has been
> discussed before, and whether anyone thinks that
> things really should be the way they are.

Our signal stackframe is almost the same as on IRIX5 which is what
some software expects. Maybe time to check out what IRIX does ...

  Ralf
* Re: Sigcontext->sc_pc Passed to User
@ 2002-07-12 8:00 ` Kevin D. Kissell
From: Kevin D. Kissell @ 2002-07-12 8:00 UTC
To: Ralf Baechle; +Cc: linux-mips

From: "Ralf Baechle" <ralf@oss.sgi.com>
> On Thu, Jul 11, 2002 at 11:08:21AM +0200, Kevin D. Kissell wrote:
[snip]
> > When a user catches a signal, such as SIGBUS, the
> > signal "payload" includes a pointer to a sigcontext
> > structure on the stack, containing the state of the
> > CPU when the exception associated with the signal
> > occurred. But not exactly. We seem to consistently
> > call compute_return_epc() before send_sig() or
> > force_sig(). This results in the user being passed
> > an indication of the faulting PC that is one instruction
> > past the true location. That would be no problem,
> > except that the faulting instruction may have been
> > in a branch delay slot, such that there is no practical
> > and reliable way for the signal handler to determine
> > which instruction failed on the basis of the sigcontext
> > data.
> >
> > It is, of course, important that execution resume
> > at the instruction following any instruction generating
> > an exception/signal. But that's not the same thing
> > as saying that the sigcontext should report the resumption
> > EPC instead of the faulting EPC. There are various
> > ways of dealing with this, but before going into any
> > of them, I'm curious as to whether this has been
> > discussed before, and whether anyone thinks that
> > things really should be the way they are.
>
> Our signal stackframe is almost the same as on IRIX5 which is what
> some software expects. Maybe time to check out what IRIX does ...

The IRIX team made some stunningly bad design
decisions over the years, my favorite being "virtual
swap space" and its side effect of deliberately killing
system daemons at random under load. A signal scheme
such as we have now in MIPS/Linux, where a user program
*cannot* identify the instruction causing a signal if
that instruction was in the delay slot of a taken branch,
is broken from first principles.

Kevin K.
* Re: Sigcontext->sc_pc Passed to User
@ 2002-07-12 10:00 ` Ralf Baechle
From: Ralf Baechle @ 2002-07-12 10:00 UTC
To: Kevin D. Kissell; +Cc: linux-mips

On Fri, Jul 12, 2002 at 10:00:27AM +0200, Kevin D. Kissell wrote:

> The IRIX team made some stunningly bad design
> decisions over the years, my favorite being "virtual
> swap space" and its side effect of deliberately killing
> system daemons at random under load. A signal scheme
> such as we have now in MIPS/Linux, where a user program
> *cannot* identify the instruction causing a signal if
> that instruction was in the delay slot of a taken branch,
> is broken from first principles.

Certainly you're right when you say a signal handler should know which
instruction was causing a fault. Ours is simply a too bad implementation
of their interface ...

IRIX virtual swap space is simply memory overcommit. Linux has that too
and it's been subject to frequent religious discussions on Linux kernel.
Non-overcommit means large amounts of memory are required when forking
of a new process. The standard example is a fat bloated Mozilla forking
for printing. Non-overcommit means you need those 50 or 100 megs of
Mozilla process size once more and if not as physical memory then at
least as swap space. Decide yourself if you're paranoid and want that
operation to only succeed if that much memory is actually available or
if you take the risk of the fork & exec operation failing the other way.

  Ralf
* Re: Sigcontext->sc_pc Passed to User
@ 2002-07-12 11:49 ` Kevin D. Kissell
From: Kevin D. Kissell @ 2002-07-12 11:49 UTC
To: Ralf Baechle; +Cc: linux-mips

From: "Ralf Baechle" <ralf@oss.sgi.com>
> On Fri, Jul 12, 2002 at 10:00:27AM +0200, Kevin D. Kissell wrote:
>
> > The IRIX team made some stunningly bad design
> > decisions over the years, my favorite being "virtual
> > swap space" and its side effect of deliberately killing
> > system daemons at random under load. A signal scheme
> > such as we have now in MIPS/Linux, where a user program
> > *cannot* identify the instruction causing a signal if
> > that instruction was in the delay slot of a taken branch,
> > is broken from first principles.
>
> Certainly you're right when you say a signal handler should know which
> instruction was causing a fault. Ours is simply a too bad implementation
> of their interface ...
>
> IRIX virtual swap space is simply memory overcommit. Linux has that too
> and it's been subject to frequent religious discussions on Linux kernel.
> Non-overcommit means large amounts of memory are required when forking
> of a new process. The standard example is a fat bloated Mozilla forking
> for printing. Non-overcommit means you need those 50 or 100 megs of
> Mozilla process size once more and if not as physical memory then at
> least as swap space. Decide yourself if you're paranoid and want that
> operation to only succeed if that much memory is actually available or
> if you take the risk of the fork & exec operation failing the other way.

Whenever it's been my design responsibility, I made forks fail if
there wasn't enough backing store to handle the process. Frankly,
there are limits to the degree to which an OS should compromise
its integrity for the sake of supporting badly conceived applications,
be they Mozilla or the SGI integrated CAD environment. But
even if you prefer to take the "speculative" or "optimistic" model
for handling the situation, what IRIX did was insane: When, after
having allowed too many unsupportable forks to succeed, they
detected deadlock in the swap system, they killed processes
*at random*. Including system daemons. At a *minimum*,
a system should only terminate processes belonging to the
user (and preferably the process group) who has been granted
speculative fork success. Anything else is a massive "breach of
contract" for a multiuser OS.

IMHO, if someone really wanted to fix this in the OS,
we'd get beyond the traditional Unix "fork" model.
And if someone really wanted to avoid the problem in Mozilla or
an IDE, one would have all subprograms launched by a tiny
"launcher", who would receive instructions and data via some
form of IPC, fork itself, and exec as appropriate.

But this is getting a bit off the topic. Is anyone aware of any
IRIX applications ported to Linux that would break if we
corrected the signal payload semantics?

Kevin K.
* Re: Sigcontext->sc_pc Passed to User
2002-07-12 11:49 ` Kevin D. Kissell
(?)
@ 2002-07-12 15:29 ` Ralf Baechle
-1 siblings, 0 replies; 18+ messages in thread
From: Ralf Baechle @ 2002-07-12 15:29 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: linux-mips

On Fri, Jul 12, 2002 at 01:49:15PM +0200, Kevin D. Kissell wrote:

> Whenever it's been my design responsibility, I made forks fail if
> there wasn't enough backing store to handle the process. Frankly,
> there are limits to the degree to which an OS should compromise
> its integrity for the sake of supporting badly conceived applications,
> be they Mozilla or the SGI integrated CAD environment. But
> even if you prefer to take the "speculative" or "optimistic" model
> for handling the situation, what IRIX did was insane: When, after
> having allowed too many unsupportable forks to succeed, they
> detected deadlock in the swap system, they killed processes
> *at random*. Including system daemons. At a *minimum*,
> a system should only terminate processes belonging to the
> user (and preferably the process group) who has been granted
> speculative fork success. Anything else is a massive "breach of
> contract" for a multiuser OS.

See linux/mm/oom_kill.c:oom_kill() ...

> IMHO, if someone really wanted to fix this in the OS,
> we'd get beyond the traditional Unix "fork" model.
> And if someone really wanted to avoid the problem in Mozilla or
> an IDE, one would have all subprograms launched by a tiny
> "launcher", who would receive instructions and data via some
> form of IPC, fork itself, and exec as appropriate.

That or, more Linux-specific, a clone/vfork & exec approach.

> But this is getting a bit off the topic. Is anyone aware of any
> IRIX applications ported to Linux that would break if we
> corrected the signal payload semantics?

As I said, we even misimplemented the IRIX semantics. In IRIX the
sc_pc field of the frame points to the instruction that caused the
signal, while we try to skip over it - with all the side effects
that we're just discussing. I tried that for both trap and break
instructions.

So I suggest we simply remove the compute_return_epc() calls from
do_bp and do_trap. I haven't tested this but I'd assume this would
also be the behaviour that gdb is expecting. So that would follow
the example given by Linux/i386 and IRIX and should solve your ISV's
problem. What more could we ask for?

I still have to look over the other exceptions that may call
compute_return_epc(), but it seems we should do the same thing for
all of them and not call compute_return_epc() if we're going to
send a signal.

Ralf
^ permalink raw reply [flat|nested] 18+ messages in thread
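To make the sc_pc point above concrete, here is a minimal C sketch of
why a handler cannot work backwards from the resume address. It is not
kernel code: struct exc_state and both helper functions are hypothetical
names invented for this illustration, and the model assumes classic
32-bit MIPS encoding only (no branch-likely, no MIPS16). The only
architectural facts it relies on are that EPC holds the branch address
and Cause.BD is set when the fault is in a branch delay slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified exception snapshot (not struct pt_regs). */
struct exc_state {
    uint32_t epc;           /* CP0 EPC; the branch address when bd is set */
    bool     bd;            /* Cause.BD: fault happened in a delay slot   */
    uint32_t branch_target; /* illustrative input: target if taken        */
    bool     branch_taken;  /* illustrative input: branch outcome         */
};

/* Address of the instruction that actually raised the exception. */
static uint32_t faulting_pc(const struct exc_state *s)
{
    assert((s->epc & 3u) == 0);     /* MIPS32 instructions are word aligned */
    return s->bd ? s->epc + 4 : s->epc;
}

/* Address where execution continues once the faulting instruction is
 * skipped. In a delay slot this depends on the branch, which is the
 * work compute_return_epc() has to do in the kernel. */
static uint32_t resume_pc(const struct exc_state *s)
{
    assert((s->epc & 3u) == 0);
    if (!s->bd)
        return s->epc + 4;
    return s->branch_taken ? s->branch_target : s->epc + 8;
}
```

In this model a fault at 0x1000 outside a delay slot and a fault at
0x2004 in the delay slot of a taken branch to 0x1004 both resume at
0x1004. A handler that is only told the resume address cannot tell the
two apart, which is the case for reporting the faulting address instead.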
* Re: Sigcontext->sc_pc Passed to User
@ 2002-07-12 13:01 ` Alan Cox
0 siblings, 0 replies; 18+ messages in thread
From: Alan Cox @ 2002-07-12 13:01 UTC (permalink / raw)
To: Ralf Baechle; +Cc: Kevin D. Kissell, linux-mips

> Non-overcommit means large amounts of memory are required when forking
> of a new process. The standard example is a fat bloated Mozilla forking
> for printing. Non-overcommit means you need those 50 or 100 megs of
> Mozilla process size once more and if not as physical memory then at
> least as swap space. Decide yourself if you're paranoid and want that
> operation to only succeed if that much memory is actually available or
> if you take the risk of the fork & exec operation failing the other way.

Your numbers are ridiculously off.

A mozilla instance on x86 commits 17Mb of potentially swap backed memory
when viewing the mozilla 1.0 start page. (It's actually a bit less but there
is delay in the garbage collector)

2.4.18/19-ac support non overcommit, and it's rather useful

Alan
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Sigcontext->sc_pc Passed to User
2002-07-12 13:01 ` Alan Cox
(?)
@ 2002-07-12 14:23 ` Ralf Baechle
2002-07-12 15:36 ` Alan Cox
-1 siblings, 1 reply; 18+ messages in thread
From: Ralf Baechle @ 2002-07-12 14:23 UTC (permalink / raw)
To: Alan Cox; +Cc: Kevin D. Kissell, linux-mips

On Fri, Jul 12, 2002 at 02:01:56PM +0100, Alan Cox wrote:

> > Non-overcommit means large amounts of memory are required when forking
> > of a new process. The standard example is a fat bloated Mozilla forking
> > for printing. Non-overcommit means you need those 50 or 100 megs of
> > Mozilla process size once more and if not as physical memory then at
> > least as swap space. Decide yourself if you're paranoid and want that
> > operation to only succeed if that much memory is actually available or
> > if you take the risk of the fork & exec operation failing the other way.
>
> Your numbers are ridiculously off.
>
> A mozilla instance on x86 commits 17Mb of potentially swap backed memory
> when viewing the mozilla 1.0 start page. (It's actually a bit less but there
> is delay in the garbage collector)

These were typical numbers of the last Mozilla I hacked myself on MIPS.
It can grow larger without doing a lot. Aside from that, this isn't
Mozilla specific; any arbitrary program that does some fork & exec
thing, and its memory size, could be chosen.

> 2.4.18/19-ac support non overcommit, and it's rather useful

No doubt about that. I just say non overcommit has been subject to long
discussions, and as usual in such religious discussions both sides had
valid arguments. I leave it to everybody to choose his / her own poison.

Ralf
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Sigcontext->sc_pc Passed to User
@ 2002-07-12 15:36 ` Alan Cox
0 siblings, 0 replies; 18+ messages in thread
From: Alan Cox @ 2002-07-12 15:36 UTC (permalink / raw)
To: Ralf Baechle; +Cc: Alan Cox, Kevin D. Kissell, linux-mips

> > A mozilla instance on x86 commits 17Mb of potentially swap backed memory
> > when viewing the mozilla 1.0 start page. (It's actually a bit less but there
> > is delay in the garbage collector)
>
> These were typical numbers of the last Mozilla I hacked myself on MIPS.
> It can grow larger without doing a lot. Aside from that, this isn't
> Mozilla specific; any arbitrary program that does some fork & exec
> thing, and its memory size, could be chosen.

These are precise, page-accurate measurements from the real world. What
most people forget is that very little of an ELF application is actually
swap backed as opposed to file backed read only
^ permalink raw reply [flat|nested] 18+ messages in thread
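The swap-backed versus file-backed distinction above can be sketched as
a toy accounting function in C. This is only a simplified model under
stated assumptions, not the accounting done by any real kernel: struct
mapping and swap_backed_bytes() are hypothetical names, and the rule
used (anonymous memory always counts, file-backed memory counts only
when it is private and writable) is a deliberate simplification. The
sizes in the usage note are made up for illustration and are not
measurements.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one region of a process address space. */
struct mapping {
    size_t bytes;       /* size of the region                         */
    bool   file_backed; /* true if pages can be re-read from a file   */
    bool   writable;    /* true if the process may modify the pages   */
    bool   shared;      /* true if writes go back to the file         */
};

/* Sum the regions that could end up needing swap in this toy model.
 * Read-only file-backed pages (program text, read-only data) can be
 * dropped and re-read from the file, so they are not counted. */
static size_t swap_backed_bytes(const struct mapping *maps, size_t n)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++) {
        const struct mapping *m = &maps[i];
        bool needs_swap = !m->file_backed || (m->writable && !m->shared);

        if (needs_swap)
            total += m->bytes;
    }
    return total;
}
```

With made-up sizes of 30 MB of read-only text, 1 MB of private writable
data, and 8 MB of anonymous heap, the function reports 9 MB, well below
the 39 MB total size of the process in this example.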
end of thread, other threads:[~2002-07-12 15:36 UTC | newest]

Thread overview: 18+ messages
2002-07-11 9:08 Sigcontext->sc_pc Passed to User Kevin D. Kissell
2002-07-11 9:08 ` Kevin D. Kissell
2002-07-11 13:17 ` Maciej W. Rozycki
2002-07-11 15:16 ` Kevin D. Kissell
2002-07-11 15:16 ` Kevin D. Kissell
2002-07-11 16:52 ` Maciej W. Rozycki
2002-07-12 1:40 ` Ralf Baechle
2002-07-12 8:00 ` Kevin D. Kissell
2002-07-12 8:00 ` Kevin D. Kissell
2002-07-12 10:00 ` Ralf Baechle
2002-07-12 11:49 ` Kevin D. Kissell
2002-07-12 11:49 ` Kevin D. Kissell
2002-07-12 15:29 ` Ralf Baechle
2002-07-12 13:01 ` Alan Cox
2002-07-12 13:01 ` Alan Cox
2002-07-12 14:23 ` Ralf Baechle
2002-07-12 15:36 ` Alan Cox
2002-07-12 15:36 ` Alan Cox