* script from 2.25.1 may be broken (hangs)
From: Thorsten Glaser @ 2014-10-08 22:25 UTC
To: util-linux

Hi,

I'm the upstream of https://www.mirbsd.org/mksh.htm whose regression
tests I normally run, in packages, under script(1) because they need
a tty, which automated build machines from distributions do not
normally provide.

I notice that in both Debian sid and OpenSuSE Factory, util-linux
2.25.1 is used, and in both, attempts to build mksh hang.

Debian: building in jessie (util-linux 2.20.1) works, building
other packages in sid works
OpenSuSE: not using script(1) to run the testsuite works.

How to reproduce: download
https://www.mirbsd.org/MirOS/dist/mir/mksh/mksh-R50d.tgz
then extract and build it:

$ tar xzf mksh-R50d.tgz
$ cd mksh
$ sh Build.sh -r
$ script -qc './test.sh -v' </dev/null 2>&1 | tee log

This exits after expand-ugly. My OBS builds hang after
heredoc-weird-5 which is a bit later in check.t (the testsuite).

On Debian, downgrading the bsdutils package to 2.20.1 fixes
the problem.

Please keep me in Cc when replying, I am not subscribed to
the newsgroup.

Thanks for your consideration!

bye,
//mirabilos

* Re: script from 2.25.1 may be broken (hangs)
From: Adam Sampson @ 2014-10-09 10:53 UTC
To: Thorsten Glaser; +Cc: util-linux

Thorsten Glaser <tg@mirbsd.org> writes:

> $ script -qc './test.sh -v' </dev/null 2>&1 | tee log

For me, this doesn't hang, but it does exit before test.sh is actually
finished. Here's a simpler example that does the same thing:

$ cat simpler.sh
#!/bin/sh
echo one
sleep 2
echo two
$ script -c './simpler.sh'
Script started, file is typescript
one
two
Script done, file is typescript
$ script -c './simpler.sh' </dev/null
Script started, file is typescript
one
Script done, file is typescript

The "wait for children" code at the end of doinput looks suspicious to
me -- finish() doesn't actually block, as the comment implies, just
checks to see if any children have finished. Running Thorsten's command
under strace -f reveals:

[21180 is the script process running doinput, 21181 is running dooutput]
21180 write(4, "\x04", 1 <unfinished ...>
21180 <... write resumed> ) = 1
21180 poll([{fd=4, events=POLLIN}], 1, 100 <unfinished ...>
21180 <... poll resumed> ) = 1 ([{fd=4, revents=POLLIN}])
21180 poll([{fd=4, events=POLLIN}], 1, 100) = 1 ([{fd=4, revents=POLLIN}])
21180 poll([{fd=4, events=POLLIN}], 1, 100) = 1 ([{fd=4, revents=POLLIN}])
21180 poll([{fd=4, events=POLLIN}], 1, 100 <unfinished ...>
[lots more poll calls while other stuff happens in child processes]
21180 <... poll resumed> ) = 0 (Timeout)
21180 wait4(-1, 0x7fff47830c54, WNOHANG, NULL) = 0
21180 kill(21181, SIGTERM) = 0

But at this point the child is still running. So it looks like doinput
isn't waiting for the child to exit correctly, when stdin has hit EOF
before the child has finished.

Thanks,

-- 
Adam Sampson <ats@offog.org> <http://offog.org/>
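
The difference between polling and really waiting can be seen outside
script(1). What follows is a minimal sketch in C, not code from
util-linux: compare_wait_modes() and its parameter names are
hypothetical, and the pipe is only an assumed way of keeping the child
alive until the parent is ready. It forks a child, polls it once with
WNOHANG while it is still running, then waits for it without WNOHANG.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX declarations in strict modes */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of script.c).
 * Stores the result of a WNOHANG poll in *polled, the result of a
 * blocking wait in *waited and the child's pid in *child_out.
 * Returns 0 on success, -1 if pipe() or fork() failed.
 */
static int compare_wait_modes(pid_t *polled, pid_t *waited, pid_t *child_out)
{
    int gate[2];
    int status;
    char byte;
    pid_t child;

    if (pipe(gate) < 0)
        return -1;

    child = fork();
    if (child < 0) {
        close(gate[0]);
        close(gate[1]);
        return -1;
    }
    if (child == 0) {
        /* Child: stay alive until the parent closes its end of the pipe. */
        close(gate[1]);
        while (read(gate[0], &byte, 1) > 0)
            ;
        _exit(0);
    }
    close(gate[0]);

    /* WNOHANG only polls: the child is still running, so this returns 0. */
    *polled = waitpid(child, &status, WNOHANG);

    /* Closing the write end gives the child EOF, so it exits. */
    close(gate[1]);

    /* Without WNOHANG the call blocks until the child has exited. */
    *waited = waitpid(child, &status, 0);
    *child_out = child;
    return 0;
}
```

After a call such as compare_wait_modes(&polled, &waited, &child),
polled should be 0 and waited should equal child. The first value
corresponds to the "wait4(-1, ..., WNOHANG, NULL) = 0" line in the
strace output: the call reports that nothing has exited yet, and the
caller carries on as if the wait were over.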

* Re: script from 2.25.1 may be broken (hangs)
From: Karel Zak @ 2014-10-09 12:21 UTC
To: Adam Sampson; +Cc: Thorsten Glaser, util-linux

On Thu, Oct 09, 2014 at 11:53:15AM +0100, Adam Sampson wrote:
> Thorsten Glaser <tg@mirbsd.org> writes:
>
> > $ script -qc './test.sh -v' </dev/null 2>&1 | tee log
>
> For me, this doesn't hang, but it does exit before test.sh is actually
> finished. Here's a simpler example that does the same thing:
>
> $ cat simpler.sh
> #!/bin/sh
> echo one
> sleep 2
> echo two
> $ script -c './simpler.sh'
> Script started, file is typescript
> one
> two
> Script done, file is typescript
> $ script -c './simpler.sh' </dev/null
> Script started, file is typescript
> one
> Script done, file is typescript
>
> The "wait for children" code at the end of doinput looks suspicious to
> me -- finish() doesn't actually block, as the comment implies, just

 hmm.. because of WNOHANG, it seems we need one function for the signal
 handler (with WNOHANG) and another function for the real program
 termination (without WNOHANG).

    Karel

-- 
 Karel Zak <kzak@redhat.com>
 http://karelzak.blogspot.com

* Fwd: script from 2.25.1 may be broken (hangs)
From: Thorsten Glaser @ 2014-10-09 13:25 UTC
To: Karel Zak; +Cc: Adam Sampson, util-linux

[-- Attachment #1: Type: TEXT/PLAIN, Size: 613 bytes --]

From: Andreas Henriksson <andreas@fatal.se>
Message-ID: <20141009130557.GA20938@fatal.se>
To: 764547@bugs.debian.org
Date: Thu, 9 Oct 2014 15:05:57 +0200
Subject: Re: Bug#764547: Fwd: script from 2.25.1 may be broken (hangs)

Thanks for the improved testcase.

Spent 2 seconds looking and the finish indeed looks like it should not
be using WNOHANG (at least) when explicitly called. Eg. like the
attached patch.

Would be great if someone was willing to take charge of looking over
this and getting a fix merged upstream. Poke me once merged and
a backport should be a trivial matter.

Regards,
Andreas Henriksson

[-- Attachment #2: Type: TEXT/PLAIN, Size: 1273 bytes --]

diff --git a/term-utils/script.c b/term-utils/script.c
index b9f8738..b12b7fd 100644
--- a/term-utils/script.c
+++ b/term-utils/script.c
@@ -80,6 +80,7 @@

 #define DEFAULT_OUTPUT "typescript"

+void sig_finish(int);
 void finish(int);
 void done(void);
 void fail(void);
@@ -258,7 +259,7 @@ main(int argc, char **argv) {
 	/* setup SIGCHLD handler */
 	sigemptyset(&sa.sa_mask);
 	sa.sa_flags = 0;
-	sa.sa_handler = finish;
+	sa.sa_handler = sig_finish;
 	sigaction(SIGCHLD, &sa, NULL);

 	/* init mask for SIGCHLD */
@@ -385,17 +386,18 @@ doinput(void) {
 	}

 	if (!die)
-		finish(0);	/* wait for childern */
+		finish(1);	/* wait for children */
 	done();
 }

 void
-finish(int dummy __attribute__ ((__unused__))) {
+finish(int wait) {
 	int status;
 	pid_t pid;
 	int errsv = errno;
+	int options = wait ? 0 : WNOHANG;

-	while ((pid = wait3(&status, WNOHANG, 0)) > 0)
+	while ((pid = wait3(&status, options, 0)) > 0)
 		if (pid == child) {
 			childstatus = status;
 			die = 1;
@@ -405,6 +407,11 @@ finish(int dummy __attribute__ ((__unused__))) {
 }

 void
+sig_finish(int dummy __attribute__ ((__unused__))) {
+	finish(0);
+}
+
+void
 resize(int dummy __attribute__ ((__unused__))) {
 	resized = 1;
 }
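
To try the pattern from the patch in isolation, a small standalone
program is enough. The sketch below is an illustration under stated
assumptions rather than the util-linux code: it uses waitpid(-1, ...)
where the patch uses wait3(), it blocks SIGCHLD around fork() so the
handler never compares against a stale pid, and run_child_and_wait()
is a hypothetical driver standing in for doinput(). The names finish,
sig_finish, child, childstatus and die follow the patch; the parameter
is called wait_for_exit here to avoid shadowing wait().

```c
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX declarations in strict modes */
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* State shared with the signal handler, named as in the patch. */
static volatile sig_atomic_t die;
static volatile pid_t child = -1;
static volatile int childstatus;

/* Reap exited children; block for them only when wait_for_exit is set. */
static void finish(int wait_for_exit)
{
    int status;
    pid_t pid;
    int errsv = errno;                  /* waitpid() may change errno */
    int options = wait_for_exit ? 0 : WNOHANG;

    while ((pid = waitpid(-1, &status, options)) > 0) {
        if (pid == child) {
            childstatus = status;
            die = 1;
        }
    }
    errno = errsv;
}

/* SIGCHLD handler: must never block, so it always polls. */
static void sig_finish(int signo)
{
    (void)signo;
    finish(0);
}

/*
 * Hypothetical driver: fork a child that exits with `code` after one
 * second, then wait for it as doinput() does after the patch.
 * Returns the child's exit code, or -1 on error.
 */
static int run_child_and_wait(int code)
{
    struct sigaction sa, old_sa;
    sigset_t block_mask, old_mask;
    int result = -1;

    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sa.sa_handler = sig_finish;
    if (sigaction(SIGCHLD, &sa, &old_sa) < 0)
        return -1;

    /* Assumption of this sketch: keep SIGCHLD blocked until `child` is set. */
    sigemptyset(&block_mask);
    sigaddset(&block_mask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block_mask, &old_mask);

    die = 0;
    child = fork();
    if (child == 0) {
        sleep(1);                       /* stands in for a running command */
        _exit(code);
    }
    sigprocmask(SIG_SETMASK, &old_mask, NULL);

    if (child > 0) {
        if (!die)
            finish(1);                  /* really wait for the child */
        if (die) {
            int st = childstatus;
            if (WIFEXITED(st))
                result = WEXITSTATUS(st);
        }
    }

    sigaction(SIGCHLD, &old_sa, NULL);  /* restore the previous handler */
    return result;
}
```

With finish(1) the driver returns only after the child has exited, so
run_child_and_wait(7) should return 7. Changing that call to finish(0)
reproduces the original behaviour: the poll finds nothing to reap, die
stays 0 and the driver returns -1 while the child is still sleeping.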

* Re: Fwd: script from 2.25.1 may be broken (hangs)
From: Karel Zak @ 2014-10-14 10:12 UTC
To: Thorsten Glaser; +Cc: Adam Sampson, util-linux

On Thu, Oct 09, 2014 at 01:25:12PM +0000, Thorsten Glaser wrote:
> diff --git a/term-utils/script.c b/term-utils/script.c
> index b9f8738..b12b7fd 100644
> --- a/term-utils/script.c
> +++ b/term-utils/script.c

 Applied, thanks. Please, don't use attachments for patches next time.

    Karel

-- 
 Karel Zak <kzak@redhat.com>
 http://karelzak.blogspot.com