* [PATCH 1/1] s390: actually restart syscalls after sys_restart
@ 2010-01-21 6:28 Serge E. Hallyn
[not found] ` <20100121062802.GA2770-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 5+ messages in thread
From: Serge E. Hallyn @ 2010-01-21 6:28 UTC (permalink / raw)
To: Oren Laadan; +Cc: Linux Containers
On x86, do_signal() leaves -516 in eax while it freezes, which
sys_restart() can use to detect that it should restart the
syscall which was interrupted by a signal (or the freezer).
On s390, gprs[2] gets tweaked to -EINTR (-4) instead, leaving
us no reliable way to tell whether should be restarted.
Define TIF_RESTARTBLOCK as a thread flag showing that the
thread expects to be frozen while kicked out of a restartable
syscall by a signal.
This is needed so that, if it is checkpointed and restarted,
the restarted task has a way to tell that, upon completion
of sys_restart(), it should restart the interrupted syscall.
Without this patch, restart of the program
close(0); close(1); close(2);
sleep(30);
immediately exits. With the patch, it continues to sleep for
the remaining sleep time.
Signed-off-by: Serge E. Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
---
arch/s390/include/asm/checkpoint_hdr.h | 1 +
arch/s390/include/asm/thread_info.h | 2 ++
arch/s390/kernel/signal.c | 3 +++
arch/s390/mm/checkpoint.c | 16 ++++++++++++++++
4 files changed, 22 insertions(+), 0 deletions(-)
diff --git a/arch/s390/include/asm/checkpoint_hdr.h b/arch/s390/include/asm/checkpoint_hdr.h
index bc9f624..4ef14e8 100644
--- a/arch/s390/include/asm/checkpoint_hdr.h
+++ b/arch/s390/include/asm/checkpoint_hdr.h
@@ -73,6 +73,7 @@ struct ckpt_hdr_cpu {
__u8 access_id;
__u8 single_step;
__u8 instruction_fetch;
+ __u8 should_restart;
};
struct ckpt_hdr_mm_context {
diff --git a/arch/s390/include/asm/thread_info.h b/arch/s390/include/asm/thread_info.h
index 07eb61b..2fac866 100644
--- a/arch/s390/include/asm/thread_info.h
+++ b/arch/s390/include/asm/thread_info.h
@@ -100,6 +100,7 @@ static inline struct thread_info *current_thread_info(void)
#define TIF_MEMDIE 19
#define TIF_RESTORE_SIGMASK 20 /* restore signal mask in do_signal() */
#define TIF_FREEZE 21 /* thread is freezing for suspend */
+#define TIF_RESTARTBLOCK 23 /* for checkpoint */
#define _TIF_NOTIFY_RESUME (1<<TIF_NOTIFY_RESUME)
#define _TIF_RESTORE_SIGMASK (1<<TIF_RESTORE_SIGMASK)
@@ -116,6 +117,7 @@ static inline struct thread_info *current_thread_info(void)
#define _TIF_POLLING_NRFLAG (1<<TIF_POLLING_NRFLAG)
#define _TIF_31BIT (1<<TIF_31BIT)
#define _TIF_FREEZE (1<<TIF_FREEZE)
+#define _TIF_RESTARTBLOCK (1<<TIF_RESTARTBLOCK)
#endif /* __KERNEL__ */
diff --git a/arch/s390/kernel/signal.c b/arch/s390/kernel/signal.c
index 6b4fef8..1179b19 100644
--- a/arch/s390/kernel/signal.c
+++ b/arch/s390/kernel/signal.c
@@ -458,6 +458,7 @@ void do_signal(struct pt_regs *regs)
regs->psw.addr = restart_addr;
break;
case -ERESTART_RESTARTBLOCK:
+ set_thread_flag(TIF_RESTARTBLOCK);
regs->gprs[2] = -EINTR;
}
regs->svcnr = 0; /* Don't deal with this again. */
@@ -467,6 +468,8 @@ void do_signal(struct pt_regs *regs)
the debugger may change all our registers ... */
signr = get_signal_to_deliver(&info, &ka, regs, NULL);
+ clear_thread_flag(TIF_RESTARTBLOCK);
+
/* Depending on the signal settings we may need to revert the
decision to restart the system call. */
if (signr > 0 && regs->psw.addr == restart_addr) {
diff --git a/arch/s390/mm/checkpoint.c b/arch/s390/mm/checkpoint.c
index 40dd417..d8d4b6b 100644
--- a/arch/s390/mm/checkpoint.c
+++ b/arch/s390/mm/checkpoint.c
@@ -65,6 +65,16 @@ static void s390_copy_regs(int op, struct ckpt_hdr_cpu *h,
BUG_ON(h->gprs[2] < 0);
h->gprs[2] = 0;
}
+ /*
+ * if the checkpointed task was frozen in a syscall with
+ * -ERESTART_RESTARTBLOCK (switched to -EINTR during do_signal()
+ * before try_to_freeze() happened) * then after restart we need
+ * to call __NR_restart_syscall to continue. Fix up here.
+ */
+ if (op == CKPT_RST && h->should_restart) {
+ regs->gprs[2] = __NR_restart_syscall;
+ set_thread_flag(TIF_RESTART_SVC);
+ }
CKPT_COPY_ARRAY(op, h->fprs, thr->fp_regs.fprs, NUM_FPRS);
CKPT_COPY_ARRAY(op, h->acrs, thr->acrs, NUM_ACRS);
CKPT_COPY_ARRAY(op, h->per_control_regs,
@@ -98,6 +108,12 @@ int checkpoint_cpu(struct ckpt_ctx *ctx, struct task_struct *t)
s390_copy_regs(CKPT_CPT, h, t);
+ /*
+ * if t was frozen while in a restartable syscall, note that
+ */
+ if (test_ti_thread_flag(task_thread_info(t), TIF_RESTARTBLOCK))
+ h->should_restart = 1;
+
ret = ckpt_write_obj(ctx, (struct ckpt_hdr *) h);
ckpt_hdr_put(ctx, h);
--
1.6.1
^ permalink raw reply related [flat|nested] 5+ messages in thread[parent not found: <20100121062802.GA2770-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 1/1] s390: actually restart syscalls after sys_restart [not found] ` <20100121062802.GA2770-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> @ 2010-01-21 15:21 ` Oren Laadan [not found] ` <Pine.LNX.4.64.1001210904210.6064-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org> 0 siblings, 1 reply; 5+ messages in thread From: Oren Laadan @ 2010-01-21 15:21 UTC (permalink / raw) To: Serge E. Hallyn; +Cc: Linux Containers Hi, I'm not sure thix fix is totally correct. First, a reminder of how -ERESTART... errors behave: 1) If a (user) signal handler was *not* called then: -ERESTARTSYS: re-execute syscall -ERESTARTNOHAND: re-execute syscall -ERESTARTNOINTR: re-execute syscall -ERESTART_RESTARTBLOCK: arrange to call sys_restart_syscall 2) If a (user) signal handler *was* called then: -ERESTARTSYS: re-execute syscall iff SA_RESTART (else EINTR) -ERESTARTNOHAND: return -EINTR -ERESTARTNOINTR: re-execute syscall -ERESTART_RESTARTBLOCK: return -EINTR The difference between x86 and s390 is _when_ (and how) the task state is altered (to arrange for re-execution), in case of task freezing: * In x86, changes occur _after_ the task is frozen, and the work is split between handle_signal() and do_signal(). * In s390, some changes occur _before_ the task is frozen, and possibly undone after the freeze if a signal handler was called. All work is done is do_signal(). Because of that, our checkpoint only sees a partial view of the task's state. Now back to the proposed fix: it isn't sufficient because: (a) If the process has a signal pending when checkpointed, then the state already reflects some changes by do_signal(). If the signal leads to a (user) signal handler after restart, then do_signal() (after restart) will _not_ undo the changes originally done before the checkpoint. This issue holds for ERESTARTSYS, ERESTARTNOHAND and ERESTART_RESTARTBLOCK. Note that do_signal() after restart will not do the changes again either in case of -ERESTART.... value, because the regs->svcnr was set to 0 and recorded that way. (b) Same, but also if the process didn't have a signal at checkpoint time, but will have one during/after restart but before resuming. (c) Because do_restart() selects its return value from gprs[2] (upon successful completion), then when it tries to resume userspace and enters do_signal() it will have -EINTR (which isn't the original return value!), and therefore not set your new TIF_RESTARTBLOCK bit. If the task is now checkpointed again, that state will be lost. As a side note: I noticed that on x86 there are {checkpoint,restore}_thread() that save the thread flags and restore the necessary flags. They also check the flags - at checkpoint to ensure the task is checkpointable, and at restore for sanity. Is there not a need for something similar for s390? That would be an appropriate place to svae and restore a TIF_RESTARTBLOCK thread flag and fix (c) above, should you stick to that method. Oren. On Thu, 21 Jan 2010, Serge E. Hallyn wrote: > On x86, do_signal() leaves -516 in eax while it freezes, which > sys_restart() can use to detect that it should restart the > syscall which was interrupted by a signal (or the freezer). > On s390, gprs[2] gets tweaked to -EINTR (-4) instead, leaving > us no reliable way to tell whether should be restarted. > > Define TIF_RESTARTBLOCK as a thread flag showing that the > thread expects to be frozen while kicked out of a restartable > syscall by a signal. > > This is needed so that, if it is checkpointed and restarted, > the restarted task has a way to tell that, upon completion > of sys_restart(), it should restart the interrupted syscall. > > Without this patch, restart of the program > > close(0); close(1); close(2); > sleep(30); > > immediately exits. With the patch, it continues to sleep for > the remaining sleep time. > > Signed-off-by: Serge E. Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> > --- > arch/s390/include/asm/checkpoint_hdr.h | 1 + > arch/s390/include/asm/thread_info.h | 2 ++ > arch/s390/kernel/signal.c | 3 +++ > arch/s390/mm/checkpoint.c | 16 ++++++++++++++++ > 4 files changed, 22 insertions(+), 0 deletions(-) > > diff --git a/arch/s390/include/asm/checkpoint_hdr.h b/arch/s390/include/asm/checkpoint_hdr.h > index bc9f624..4ef14e8 100644 > --- a/arch/s390/include/asm/checkpoint_hdr.h > +++ b/arch/s390/include/asm/checkpoint_hdr.h > @@ -73,6 +73,7 @@ struct ckpt_hdr_cpu { > __u8 access_id; > __u8 single_step; > __u8 instruction_fetch; > + __u8 should_restart; > }; > > struct ckpt_hdr_mm_context { > diff --git a/arch/s390/include/asm/thread_info.h b/arch/s390/include/asm/thread_info.h > index 07eb61b..2fac866 100644 > --- a/arch/s390/include/asm/thread_info.h > +++ b/arch/s390/include/asm/thread_info.h > @@ -100,6 +100,7 @@ static inline struct thread_info *current_thread_info(void) > #define TIF_MEMDIE 19 > #define TIF_RESTORE_SIGMASK 20 /* restore signal mask in do_signal() */ > #define TIF_FREEZE 21 /* thread is freezing for suspend */ > +#define TIF_RESTARTBLOCK 23 /* for checkpoint */ > > #define _TIF_NOTIFY_RESUME (1<<TIF_NOTIFY_RESUME) > #define _TIF_RESTORE_SIGMASK (1<<TIF_RESTORE_SIGMASK) > @@ -116,6 +117,7 @@ static inline struct thread_info *current_thread_info(void) > #define _TIF_POLLING_NRFLAG (1<<TIF_POLLING_NRFLAG) > #define _TIF_31BIT (1<<TIF_31BIT) > #define _TIF_FREEZE (1<<TIF_FREEZE) > +#define _TIF_RESTARTBLOCK (1<<TIF_RESTARTBLOCK) > > #endif /* __KERNEL__ */ > > diff --git a/arch/s390/kernel/signal.c b/arch/s390/kernel/signal.c > index 6b4fef8..1179b19 100644 > --- a/arch/s390/kernel/signal.c > +++ b/arch/s390/kernel/signal.c > @@ -458,6 +458,7 @@ void do_signal(struct pt_regs *regs) > regs->psw.addr = restart_addr; > break; > case -ERESTART_RESTARTBLOCK: > + set_thread_flag(TIF_RESTARTBLOCK); > regs->gprs[2] = -EINTR; > } > regs->svcnr = 0; /* Don't deal with this again. */ > @@ -467,6 +468,8 @@ void do_signal(struct pt_regs *regs) > the debugger may change all our registers ... */ > signr = get_signal_to_deliver(&info, &ka, regs, NULL); > > + clear_thread_flag(TIF_RESTARTBLOCK); > + > /* Depending on the signal settings we may need to revert the > decision to restart the system call. */ > if (signr > 0 && regs->psw.addr == restart_addr) { > diff --git a/arch/s390/mm/checkpoint.c b/arch/s390/mm/checkpoint.c > index 40dd417..d8d4b6b 100644 > --- a/arch/s390/mm/checkpoint.c > +++ b/arch/s390/mm/checkpoint.c > @@ -65,6 +65,16 @@ static void s390_copy_regs(int op, struct ckpt_hdr_cpu *h, > BUG_ON(h->gprs[2] < 0); > h->gprs[2] = 0; > } > + /* > + * if the checkpointed task was frozen in a syscall with > + * -ERESTART_RESTARTBLOCK (switched to -EINTR during do_signal() > + * before try_to_freeze() happened) * then after restart we need > + * to call __NR_restart_syscall to continue. Fix up here. > + */ > + if (op == CKPT_RST && h->should_restart) { > + regs->gprs[2] = __NR_restart_syscall; > + set_thread_flag(TIF_RESTART_SVC); > + } > CKPT_COPY_ARRAY(op, h->fprs, thr->fp_regs.fprs, NUM_FPRS); > CKPT_COPY_ARRAY(op, h->acrs, thr->acrs, NUM_ACRS); > CKPT_COPY_ARRAY(op, h->per_control_regs, > @@ -98,6 +108,12 @@ int checkpoint_cpu(struct ckpt_ctx *ctx, struct task_struct *t) > > s390_copy_regs(CKPT_CPT, h, t); > > + /* > + * if t was frozen while in a restartable syscall, note that > + */ > + if (test_ti_thread_flag(task_thread_info(t), TIF_RESTARTBLOCK)) > + h->should_restart = 1; > + > ret = ckpt_write_obj(ctx, (struct ckpt_hdr *) h); > ckpt_hdr_put(ctx, h); > > ^ permalink raw reply [flat|nested] 5+ messages in thread
[parent not found: <Pine.LNX.4.64.1001210904210.6064-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org>]
* Re: [PATCH 1/1] s390: actually restart syscalls after sys_restart [not found] ` <Pine.LNX.4.64.1001210904210.6064-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org> @ 2010-01-21 15:43 ` Serge E. Hallyn [not found] ` <20100121154332.GE6878-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 5+ messages in thread From: Serge E. Hallyn @ 2010-01-21 15:43 UTC (permalink / raw) To: Oren Laadan; +Cc: Linux Containers Quoting Oren Laadan (orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org): > Hi, > > I'm not sure thix fix is totally correct. > > First, a reminder of how -ERESTART... errors behave: > > 1) If a (user) signal handler was *not* called then: > -ERESTARTSYS: re-execute syscall > -ERESTARTNOHAND: re-execute syscall > -ERESTARTNOINTR: re-execute syscall > -ERESTART_RESTARTBLOCK: arrange to call sys_restart_syscall > > 2) If a (user) signal handler *was* called then: > -ERESTARTSYS: re-execute syscall iff SA_RESTART (else EINTR) > -ERESTARTNOHAND: return -EINTR > -ERESTARTNOINTR: re-execute syscall > -ERESTART_RESTARTBLOCK: return -EINTR > > The difference between x86 and s390 is _when_ (and how) the task state is > altered (to arrange for re-execution), in case of task freezing: > > * In x86, changes occur _after_ the task is frozen, and the work is split > between handle_signal() and do_signal(). > > * In s390, some changes occur _before_ the task is frozen, and possibly > undone after the freeze if a signal handler was called. All work is done > is do_signal(). Because of that, our checkpoint only sees a partial view > of the task's state. > > Now back to the proposed fix: it isn't sufficient because: > > (a) If the process has a signal pending when checkpointed, then the state > already reflects some changes by do_signal(). If the signal leads to a > (user) signal handler after restart, then do_signal() (after restart) will > _not_ undo the changes originally done before the checkpoint. > > This issue holds for ERESTARTSYS, ERESTARTNOHAND and ERESTART_RESTARTBLOCK. > > Note that do_signal() after restart will not do the changes again either > in case of -ERESTART.... value, because the regs->svcnr was set to 0 and > recorded that way. > > (b) Same, but also if the process didn't have a signal at checkpoint > time, but will have one during/after restart but before resuming. > > (c) Because do_restart() selects its return value from gprs[2] (upon > successful completion), then when it tries to resume userspace and > enters do_signal() it will have -EINTR (which isn't the original return > value!), and therefore not set your new TIF_RESTARTBLOCK bit. If the task > is now checkpointed again, that state will be lost. Sorry if I'm misunderstanding what you're saying, but the return value after exiting sys_restart() won't be -EINTR now, bc my patch sets it to __NR_restart_syscall. > As a side note: I noticed that on x86 there are {checkpoint,restore}_thread() > that save the thread flags and restore the necessary flags. They also > check the flags - at checkpoint to ensure the task is checkpointable, and > at restore for sanity. Is there not a need for something similar for s390? > > That would be an appropriate place to svae and restore a TIF_RESTARTBLOCK > thread flag and fix (c) above, should you stick to that method. We actually want that flag cleared - the flag is only there so that checkpoint can tell that it needs to set the 'should_restart' flag in the thread info in the checkpoint image. sys_estart does not then reset that flag, it instead does the steps which are done in the last block of do_signal(): set gprs[2] to NR_syscall_restart and add the TIF_RESTART_SVC thread flag. -serge ^ permalink raw reply [flat|nested] 5+ messages in thread
[parent not found: <20100121154332.GE6878-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 1/1] s390: actually restart syscalls after sys_restart [not found] ` <20100121154332.GE6878-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> @ 2010-01-21 16:00 ` Oren Laadan [not found] ` <Pine.LNX.4.64.1001211048260.10869-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org> 0 siblings, 1 reply; 5+ messages in thread From: Oren Laadan @ 2010-01-21 16:00 UTC (permalink / raw) To: Serge E. Hallyn; +Cc: Linux Containers On Thu, 21 Jan 2010, Serge E. Hallyn wrote: > Quoting Oren Laadan (orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org): > > Hi, > > > > I'm not sure thix fix is totally correct. > > > > First, a reminder of how -ERESTART... errors behave: > > > > 1) If a (user) signal handler was *not* called then: > > -ERESTARTSYS: re-execute syscall > > -ERESTARTNOHAND: re-execute syscall > > -ERESTARTNOINTR: re-execute syscall > > -ERESTART_RESTARTBLOCK: arrange to call sys_restart_syscall > > > > 2) If a (user) signal handler *was* called then: > > -ERESTARTSYS: re-execute syscall iff SA_RESTART (else EINTR) > > -ERESTARTNOHAND: return -EINTR > > -ERESTARTNOINTR: re-execute syscall > > -ERESTART_RESTARTBLOCK: return -EINTR > > > > The difference between x86 and s390 is _when_ (and how) the task state is > > altered (to arrange for re-execution), in case of task freezing: > > > > * In x86, changes occur _after_ the task is frozen, and the work is split > > between handle_signal() and do_signal(). > > > > * In s390, some changes occur _before_ the task is frozen, and possibly > > undone after the freeze if a signal handler was called. All work is done > > is do_signal(). Because of that, our checkpoint only sees a partial view > > of the task's state. > > > > Now back to the proposed fix: it isn't sufficient because: > > > > (a) If the process has a signal pending when checkpointed, then the state > > already reflects some changes by do_signal(). If the signal leads to a > > (user) signal handler after restart, then do_signal() (after restart) will > > _not_ undo the changes originally done before the checkpoint. > > > > This issue holds for ERESTARTSYS, ERESTARTNOHAND and ERESTART_RESTARTBLOCK. > > > > Note that do_signal() after restart will not do the changes again either > > in case of -ERESTART.... value, because the regs->svcnr was set to 0 and > > recorded that way. > > > > (b) Same, but also if the process didn't have a signal at checkpoint > > time, but will have one during/after restart but before resuming. > > > > (c) Because do_restart() selects its return value from gprs[2] (upon > > successful completion), then when it tries to resume userspace and > > enters do_signal() it will have -EINTR (which isn't the original return > > value!), and therefore not set your new TIF_RESTARTBLOCK bit. If the task > > is now checkpointed again, that state will be lost. > > Sorry if I'm misunderstanding what you're saying, but the return value > after exiting sys_restart() won't be -EINTR now, bc my patch sets it to > __NR_restart_syscall. True ... but the consequences are the same: Consider task A is checkpointed, then restarted, but before it completely resumes userspace, it is checkpointed again. For the second checkpoint, do_signal() will see svcnr==0, and even if you set svcnr, then gprs[2]>0. Therefore, do_signal() will not set (again) the TIF_RESTARTBLOCK flag because that happens only for svcnr!=0 and gprs[2]=-ERESTART_RESTARTBLOCK. So the second checkpoint image lost that particular state you were interested. Therefore, when you restart from that image, the gprs[2] will be set correctly (if ignoring issues a, b), but the TIF_RESTART_SVC won't be set again. > > > As a side note: I noticed that on x86 there are {checkpoint,restore}_thread() > > that save the thread flags and restore the necessary flags. They also > > check the flags - at checkpoint to ensure the task is checkpointable, and > > at restore for sanity. Is there not a need for something similar for s390? > > > > That would be an appropriate place to svae and restore a TIF_RESTARTBLOCK > > thread flag and fix (c) above, should you stick to that method. > > We actually want that flag cleared - the flag is only there so that > checkpoint can tell that it needs to set the 'should_restart' flag in > the thread info in the checkpoint image. sys_estart does not then reset > that flag, it instead does the steps which are done in the last block > of do_signal(): set gprs[2] to NR_syscall_restart and add the TIF_RESTART_SVC > thread flag. The problem is that sys_restart *does not even set* that flag, and do_signal() will not set that flag either after restarting from a checkpoint that had should_restart set. Therefore a subsequent checkpoint that occurs _before_ the process really resumes to userspace will not carry that information. Oren. ^ permalink raw reply [flat|nested] 5+ messages in thread
[parent not found: <Pine.LNX.4.64.1001211048260.10869-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org>]
* Re: [PATCH 1/1] s390: actually restart syscalls after sys_restart [not found] ` <Pine.LNX.4.64.1001211048260.10869-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org> @ 2010-01-22 4:11 ` Serge E. Hallyn 0 siblings, 0 replies; 5+ messages in thread From: Serge E. Hallyn @ 2010-01-22 4:11 UTC (permalink / raw) To: Oren Laadan; +Cc: Linux Containers Quoting Oren Laadan (orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org): > On Thu, 21 Jan 2010, Serge E. Hallyn wrote: > > > Quoting Oren Laadan (orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org): > > > Hi, > > > > > > I'm not sure thix fix is totally correct. > > > > > > First, a reminder of how -ERESTART... errors behave: > > > > > > 1) If a (user) signal handler was *not* called then: > > > -ERESTARTSYS: re-execute syscall > > > -ERESTARTNOHAND: re-execute syscall > > > -ERESTARTNOINTR: re-execute syscall > > > -ERESTART_RESTARTBLOCK: arrange to call sys_restart_syscall > > > > > > 2) If a (user) signal handler *was* called then: > > > -ERESTARTSYS: re-execute syscall iff SA_RESTART (else EINTR) > > > -ERESTARTNOHAND: return -EINTR > > > -ERESTARTNOINTR: re-execute syscall > > > -ERESTART_RESTARTBLOCK: return -EINTR > > > > > > The difference between x86 and s390 is _when_ (and how) the task state is > > > altered (to arrange for re-execution), in case of task freezing: > > > > > > * In x86, changes occur _after_ the task is frozen, and the work is split > > > between handle_signal() and do_signal(). > > > > > > * In s390, some changes occur _before_ the task is frozen, and possibly > > > undone after the freeze if a signal handler was called. All work is done > > > is do_signal(). Because of that, our checkpoint only sees a partial view > > > of the task's state. > > > > > > Now back to the proposed fix: it isn't sufficient because: > > > > > > (a) If the process has a signal pending when checkpointed, then the state > > > already reflects some changes by do_signal(). If the signal leads to a > > > (user) signal handler after restart, then do_signal() (after restart) will > > > _not_ undo the changes originally done before the checkpoint. > > > > > > This issue holds for ERESTARTSYS, ERESTARTNOHAND and ERESTART_RESTARTBLOCK. > > > > > > Note that do_signal() after restart will not do the changes again either > > > in case of -ERESTART.... value, because the regs->svcnr was set to 0 and > > > recorded that way. > > > > > > (b) Same, but also if the process didn't have a signal at checkpoint > > > time, but will have one during/after restart but before resuming. > > > > > > (c) Because do_restart() selects its return value from gprs[2] (upon > > > successful completion), then when it tries to resume userspace and > > > enters do_signal() it will have -EINTR (which isn't the original return > > > value!), and therefore not set your new TIF_RESTARTBLOCK bit. If the task > > > is now checkpointed again, that state will be lost. > > > > Sorry if I'm misunderstanding what you're saying, but the return value > > after exiting sys_restart() won't be -EINTR now, bc my patch sets it to > > __NR_restart_syscall. > > True ... but the consequences are the same: > > Consider task A is checkpointed, then restarted, but before it completely > resumes userspace, it is checkpointed again. For the second checkpoint, > do_signal() will see svcnr==0, and even if you set svcnr, then gprs[2]>0. > Therefore, do_signal() will not set (again) the TIF_RESTARTBLOCK flag > because that happens only for svcnr!=0 and gprs[2]=-ERESTART_RESTARTBLOCK. > > So the second checkpoint image lost that particular state you were > interested. Therefore, when you restart from that image, the gprs[2] will > be set correctly (if ignoring issues a, b), but the TIF_RESTART_SVC won't > be set again. > > > > > > As a side note: I noticed that on x86 there are {checkpoint,restore}_thread() > > > that save the thread flags and restore the necessary flags. They also > > > check the flags - at checkpoint to ensure the task is checkpointable, and > > > at restore for sanity. Is there not a need for something similar for s390? > > > > > > That would be an appropriate place to svae and restore a TIF_RESTARTBLOCK > > > thread flag and fix (c) above, should you stick to that method. > > > > We actually want that flag cleared - the flag is only there so that > > checkpoint can tell that it needs to set the 'should_restart' flag in > > the thread info in the checkpoint image. sys_estart does not then reset > > that flag, it instead does the steps which are done in the last block > > of do_signal(): set gprs[2] to NR_syscall_restart and add the TIF_RESTART_SVC > > thread flag. > > The problem is that sys_restart *does not even set* that flag, and > do_signal() will not set that flag either after restarting from a > checkpoint that had should_restart set. Therefore a subsequent checkpoint > that occurs _before_ the process really resumes to userspace will not > carry that information. Good point. But your earlier proposal of restoring TIF_RESTARTBLOCK in a arch_restore_thread() isn't quite right, because it needs to get cleared at some point, and we wouldn't know when to do that. I'll have to think about this more. (One possibility is to have the top of do_signal() also set TIF_RESTARTBLOCK if TIF_RESTART_SVC is already set, which it would be if do_signal were called immediately after sys_restart and before the original syscall were restarted. But I suspect there is a complicating subtlety there as well...) thanks, -serge ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2010-01-22 4:11 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-21 6:28 [PATCH 1/1] s390: actually restart syscalls after sys_restart Serge E. Hallyn
[not found] ` <20100121062802.GA2770-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2010-01-21 15:21 ` Oren Laadan
[not found] ` <Pine.LNX.4.64.1001210904210.6064-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org>
2010-01-21 15:43 ` Serge E. Hallyn
[not found] ` <20100121154332.GE6878-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2010-01-21 16:00 ` Oren Laadan
[not found] ` <Pine.LNX.4.64.1001211048260.10869-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org>
2010-01-22 4:11 ` Serge E. Hallyn
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.