* [PATCH 0/2] cryo: Re-enable checkpointing of thread area
@ 2008-06-11 14:13 Benjamin Thery
2008-06-11 14:14 ` [PATCH 1/2] cryo: re-enable " Benjamin Thery
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Benjamin Thery @ 2008-06-11 14:13 UTC (permalink / raw)
To: Serge E. Hallyn; +Cc: Containers, Benjamin Thery
I found the cause of one of the general protection faults I saw with
my test program and I finally managed to completely restart (a very
dumb) program for the first time!
My program was failing (GPF) at restart in glibc code. After some
debugging I found that the failures occurred on SINGLE_THREAD_P calls
(e.g. glibc/sysdeps/posix/system.c:__libc_system()).
I suspected a problem with nptl and remembered the comments in cr.c
("for redhat 9.0, NPTL") and in cr.txt ("Support linuxthreads, but not
NPTL."). I uncommented this code that checkpoints the thread area
(don't ask me what it is) and, voila, my program restarted!
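For readers wondering what the thread area is, a hedged aside: as far as I understand, on i386 NPTL keeps each thread's control block in a TLS segment that is described by a GDT entry (installed with set_thread_area(2)) and reached through %gs. SINGLE_THREAD_P reads a field of that block, so if the descriptor is not saved at checkpoint and reinstalled at restart, the first such access can fault. The sketch below shows only the save/restore round trip of such a descriptor as a named item. It is not cryo's code: struct tls_desc, struct item_hdr, save_item() and load_item() are hypothetical names, and the item layout is an assumption, since cryo's write_item() format is not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for struct user_desc from <asm/ldt.h>, declared
 * locally so the sketch builds on any architecture. */
struct tls_desc {
	uint32_t entry_number;	/* GDT slot holding the TLS segment */
	uint32_t base_addr;	/* address of the thread control block */
	uint32_t limit;		/* segment limit */
	uint32_t flags;		/* seg_32bit, limit_in_pages, ... packed */
};

/* Assumed item header; NOT cryo's real checkpoint format. */
struct item_hdr {
	char name[8];		/* NUL-padded item name, e.g. "ldt" */
	uint32_t len;		/* payload size in bytes */
};

/* Write exactly len bytes, retrying on short writes. */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n <= 0)
			return -1;
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Read exactly len bytes; fail on EOF or error. */
static int read_all(int fd, void *buf, size_t len)
{
	char *p = buf;

	while (len > 0) {
		ssize_t n = read(fd, p, len);
		if (n <= 0)
			return -1;
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Save one named item: header first, then the raw payload. */
int save_item(int fd, const char *name, const void *data, uint32_t len)
{
	struct item_hdr hdr;

	assert(name != NULL && data != NULL);
	if (strlen(name) >= sizeof(hdr.name))
		return -1;
	memset(&hdr, 0, sizeof(hdr));
	strcpy(hdr.name, name);
	hdr.len = len;
	if (write_all(fd, &hdr, sizeof(hdr)) < 0)
		return -1;
	return write_all(fd, data, len);
}

/* Load the next item; it must carry the expected name and size. */
int load_item(int fd, const char *name, void *data, uint32_t len)
{
	struct item_hdr hdr;

	if (read_all(fd, &hdr, sizeof(hdr)) < 0)
		return -1;
	hdr.name[sizeof(hdr.name) - 1] = '\0';
	if (strcmp(hdr.name, name) != 0 || hdr.len != len)
		return -1;
	return read_all(fd, data, len);
}
```

At checkpoint time the descriptor obtained from the traced process would be passed to save_item(); at restart the record returned by load_item() would be handed back to the kernel for the new process, which is the role ptrace_set_thread_area() plays in the patch.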
It doesn't solve everything: I still have issues restarting the 'sleep'
program.
Benjamin
--
* [PATCH 1/2] cryo: re-enable checkpointing of thread area
From: Benjamin Thery @ 2008-06-11 14:14 UTC
To: Serge E. Hallyn; +Cc: Containers, Benjamin Thery

This patch re-enables the code that checkpoints (and restores) the thread
area (ldt) using ptrace_get_thread_area(). This seems to improve the
situation a lot on systems with NPTL: it solved one of the general
protection faults I had when restarting a program.

Signed-off-by: Benjamin Thery <benjamin.thery-6ktuUTfB/bM@public.gmane.org>
---
 cr.c |   17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

Index: cryodev/cr.c
===================================================================
--- cryodev.orig/cr.c
+++ cryodev/cr.c
@@ -24,7 +24,7 @@
 #include <signal.h>
 #include <errno.h>
 
-#include <asm/ldt.h> /* for redhat 9.0, NPTL */
+#include <asm/ldt.h> /* for NPTL */
 
 #include "utils.h"
 #include "sci.h"
@@ -513,14 +513,12 @@ static int save_process_data(pid_t pid,
 		return 0;
 	}
 
-	/* This is required in redhat9 */
-#if 0
+	/* This is required for NPTL */
 	{
-		modify_ldt_t ldt;
+		struct user_desc ldt;
 		if (ptrace_get_thread_area(pid, &ldt) == 0)
 			write_item(fd, "ldt", (void *)&ldt, sizeof(ldt));
 	}
-#endif
 
 	snprintf(fname, sizeof(fname), "/proc/%u/exe", pid);
 	memset(exe, 0, sizeof(exe));
@@ -1237,7 +1235,7 @@ static int process_restart(int fd, int m
 	char *exe = NULL, *cwd = NULL, *sargv = NULL, *senv = NULL;
 	struct user_regs_struct *regs = NULL;
 	struct user_fpregs_struct *fpregs = NULL;
-	//modify_ldt_t *ldt = NULL;
+	struct user_desc *ldt = NULL;
 	int *exitsig = NULL;
 	sigset_t *sigmask = NULL, *sigpend = NULL;
 	struct sigaction *sigact = NULL;
@@ -1262,7 +1260,7 @@ static int process_restart(int fd, int m
 	Free(senv);
 	Free(regs);
 	Free(fpregs);
-	//Free(ldt);
+	Free(ldt);
 	Free(sigact);
 	Free(sigmask);
 	Free(sigpend);
@@ -1276,7 +1274,7 @@ static int process_restart(int fd, int m
 	else ITEM_SET(cwd, char);
 	else ITEM_SET(regs, struct user_regs_struct);
 	else ITEM_SET(fpregs, struct user_fpregs_struct);
-	//else ITEM_SET(ldt, modify_ldt_t);
+	else ITEM_SET(ldt, struct user_desc);
 	else ITEM_SET(sigact, struct sigaction);
 	else ITEM_SET(sigmask, sigset_t);
 	else ITEM_SET(sigpend, sigset_t);
@@ -1304,7 +1302,8 @@ static int process_restart(int fd, int m
 		ERROR("lh_hash_add(%p, %u, %p)\n", &hpid, (unsigned)*pid, (void *)npid);
 		return -1;
 	}
-	//if (ldt) ptrace_set_thread_area(npid, ldt);
+	if (ldt)
+		ptrace_set_thread_area(npid, ldt);
 	if (cwd) PT_CHDIR(npid, cwd);
 	restore_fd(fd, npid);
 	} else if (ITEM_IS("SOCK")) {
--
* Re: [PATCH 1/2] cryo: re-enable checkpointing of thread area
From: Serge E. Hallyn @ 2008-06-11 15:28 UTC
To: Benjamin Thery; +Cc: Containers

Quoting Benjamin Thery (benjamin.thery-6ktuUTfB/bM@public.gmane.org):
> This patch re-enables the code that checkpoints (and restores) the thread
> area (ldt) using ptrace_get_thread_area(). This seems to improve the
> situation a lot on systems with NPTL: it solved one of the general
> protection faults I had when restarting a program.
>
> Signed-off-by: Benjamin Thery <benjamin.thery-6ktuUTfB/bM@public.gmane.org>

Benjamin, you rock. This fixes my kvm image as well.

Nadia, Suka, could you confirm that this does *not* break cryo on your
systems?

Patch added to git tree.

thanks,
-serge

> ---
>  cr.c |   17 ++++++++---------
>  1 file changed, 8 insertions(+), 9 deletions(-)
>
> Index: cryodev/cr.c
> ===================================================================
> --- cryodev.orig/cr.c
> +++ cryodev/cr.c
> @@ -24,7 +24,7 @@
>  #include <signal.h>
>  #include <errno.h>
>
> -#include <asm/ldt.h> /* for redhat 9.0, NPTL */
> +#include <asm/ldt.h> /* for NPTL */
>
>  #include "utils.h"
>  #include "sci.h"
> @@ -513,14 +513,12 @@ static int save_process_data(pid_t pid,
>  		return 0;
>  	}
>
> -	/* This is required in redhat9 */
> -#if 0
> +	/* This is required for NPTL */
>  	{
> -		modify_ldt_t ldt;
> +		struct user_desc ldt;
>  		if (ptrace_get_thread_area(pid, &ldt) == 0)
>  			write_item(fd, "ldt", (void *)&ldt, sizeof(ldt));
>  	}
> -#endif
>
>  	snprintf(fname, sizeof(fname), "/proc/%u/exe", pid);
>  	memset(exe, 0, sizeof(exe));
> @@ -1237,7 +1235,7 @@ static int process_restart(int fd, int m
>  	char *exe = NULL, *cwd = NULL, *sargv = NULL, *senv = NULL;
>  	struct user_regs_struct *regs = NULL;
>  	struct user_fpregs_struct *fpregs = NULL;
> -	//modify_ldt_t *ldt = NULL;
> +	struct user_desc *ldt = NULL;
>  	int *exitsig = NULL;
>  	sigset_t *sigmask = NULL, *sigpend = NULL;
>  	struct sigaction *sigact = NULL;
> @@ -1262,7 +1260,7 @@ static int process_restart(int fd, int m
>  	Free(senv);
>  	Free(regs);
>  	Free(fpregs);
> -	//Free(ldt);
> +	Free(ldt);
>  	Free(sigact);
>  	Free(sigmask);
>  	Free(sigpend);
> @@ -1276,7 +1274,7 @@ static int process_restart(int fd, int m
>  	else ITEM_SET(cwd, char);
>  	else ITEM_SET(regs, struct user_regs_struct);
>  	else ITEM_SET(fpregs, struct user_fpregs_struct);
> -	//else ITEM_SET(ldt, modify_ldt_t);
> +	else ITEM_SET(ldt, struct user_desc);
>  	else ITEM_SET(sigact, struct sigaction);
>  	else ITEM_SET(sigmask, sigset_t);
>  	else ITEM_SET(sigpend, sigset_t);
> @@ -1304,7 +1302,8 @@ static int process_restart(int fd, int m
>  		ERROR("lh_hash_add(%p, %u, %p)\n", &hpid, (unsigned)*pid, (void *)npid);
>  		return -1;
>  	}
> -	//if (ldt) ptrace_set_thread_area(npid, ldt);
> +	if (ldt)
> +		ptrace_set_thread_area(npid, ldt);
>  	if (cwd) PT_CHDIR(npid, cwd);
>  	restore_fd(fd, npid);
>  	} else if (ITEM_IS("SOCK")) {
>
> --
* [PATCH 2/2] cryo: minimal test program
From: Benjamin Thery @ 2008-06-11 14:14 UTC
To: Serge E. Hallyn; +Cc: Containers, Benjamin Thery

This is the dumb test program I managed to restart after I re-enabled
the checkpointing of thread area stuff.

Signed-off-by: Benjamin Thery <benjamin.thery-6ktuUTfB/bM@public.gmane.org>
---
 tests/Makefile  |    2 +-
 tests/compute.c |   17 +++++++++++++++++
 2 files changed, 18 insertions(+), 1 deletion(-)

Index: cryodev/tests/Makefile
===================================================================
--- cryodev.orig/tests/Makefile
+++ cryodev/tests/Makefile
@@ -1,4 +1,4 @@
-TARGETS = sleep mksysvipc pause_asm
+TARGETS = sleep mksysvipc pause_asm compute
 
 CFLAGS = -static
 
Index: cryodev/tests/compute.c
===================================================================
--- /dev/null
+++ cryodev/tests/compute.c
@@ -0,0 +1,17 @@
+#include <stdio.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+int main()
+{
+	int i = 0;
+	double f = 0;
+
+	printf("Running as %d\n", getpid());
+	while (i<1000000000) {
+		f = i / 0.000234567;
+		if (i%10000000 == 0)
+			printf("i is %d (pid %d)\n", i, getpid());
+		i++;
+	}
+}
--
* Re: [PATCH 0/2] cryo: Re-enable checkpointing of thread area
From: Benjamin Thery @ 2008-06-11 14:41 UTC
To: Serge E. Hallyn; +Cc: Containers

Benjamin Thery wrote:
> I found the cause of one of the general protection faults I saw with
> my test program and I finally managed to completely restart (a very
> dumb) program for the first time!
>
> My program was failing (GPF) at restart in glibc code. After some
> debugging I found that the failures occurred on SINGLE_THREAD_P calls
> (e.g. glibc/sysdeps/posix/system.c:__libc_system()).
>
> I suspected a problem with nptl and remembered the comments in cr.c
> ("for redhat 9.0, NPTL") and in cr.txt ("Support linuxthreads, but not
> NPTL."). I uncommented this code that checkpoints the thread area
> (don't ask me what it is) and, voila, my program restarted!
>
> It doesn't solve everything: I still have issues restarting the 'sleep'
> program.

I spoke too soon... in fact I have no more issues with the sleep or
mksysvipc programs. They both restart fine now. :)

Benjamin

--
B e n j a m i n   T h e r y - BULL/DT/Open Software R&D
http://www.bull.com