* script from 2.25.1 may be broken (hangs)
@ 2014-10-08 22:25 Thorsten Glaser
2014-10-09 10:53 ` Adam Sampson
From: Thorsten Glaser @ 2014-10-08 22:25 UTC
To: util-linux
Hi,
I'm the upstream maintainer of mksh (https://www.mirbsd.org/mksh.htm).
In packages, I normally run its regression tests under script(1) because
they need a tty, which distributions' automated build machines do not
normally provide.
I notice that in both Debian sid and OpenSuSE Factory, util-linux 2.25.1
is used, and in both, attempts to build mksh hang.
Debian: building in jessie (util-linux 2.20.1) works,
building other packages in sid works
OpenSuSE: not using script(1) to run the testsuite works.
How to reproduce: download
https://www.mirbsd.org/MirOS/dist/mir/mksh/mksh-R50d.tgz
then extract and build it:
$ tar xzf mksh-R50d.tgz
$ cd mksh
$ sh Build.sh -r
$ script -qc './test.sh -v' </dev/null 2>&1 | tee log
This exits after expand-ugly. My OBS builds hang after
heredoc-weird-5 which is a bit later in check.t (the testsuite).
On Debian, downgrading the bsdutils package to 2.20.1
fixes the problem.
Please keep me in Cc when replying, I am not subscribed to the newsgroup.
Thanks for your consideration!
bye,
//mirabilos
* Re: script from 2.25.1 may be broken (hangs)
2014-10-08 22:25 script from 2.25.1 may be broken (hangs) Thorsten Glaser
@ 2014-10-09 10:53 ` Adam Sampson
2014-10-09 12:21 ` Karel Zak
From: Adam Sampson @ 2014-10-09 10:53 UTC
To: Thorsten Glaser; +Cc: util-linux
Thorsten Glaser <tg@mirbsd.org> writes:
> $ script -qc './test.sh -v' </dev/null 2>&1 | tee log
For me, this doesn't hang, but it does exit before test.sh is actually
finished. Here's a simpler example that does the same thing:
$ cat simpler.sh
#!/bin/sh
echo one
sleep 2
echo two
$ script -c './simpler.sh'
Script started, file is typescript
one
two
Script done, file is typescript
$ script -c './simpler.sh' </dev/null
Script started, file is typescript
one
Script done, file is typescript
The "wait for children" code at the end of doinput looks suspicious to
me -- finish() doesn't actually block, as the comment implies, just
checks to see if any children have finished. Running Thorsten's command
under strace -f reveals:
[21180 is the script process running doinput, 21181 is running dooutput]
21180 write(4, "\x04", 1 <unfinished ...>
21180 <... write resumed> ) = 1
21180 poll([{fd=4, events=POLLIN}], 1, 100 <unfinished ...>
21180 <... poll resumed> ) = 1 ([{fd=4, revents=POLLIN}])
21180 poll([{fd=4, events=POLLIN}], 1, 100) = 1 ([{fd=4, revents=POLLIN}])
21180 poll([{fd=4, events=POLLIN}], 1, 100) = 1 ([{fd=4, revents=POLLIN}])
21180 poll([{fd=4, events=POLLIN}], 1, 100 <unfinished ...>
[lots more poll calls while other stuff happens in child processes]
21180 <... poll resumed> ) = 0 (Timeout)
21180 wait4(-1, 0x7fff47830c54, WNOHANG, NULL) = 0
21180 kill(21181, SIGTERM) = 0
But at this point the child is still running. So it looks like doinput
isn't waiting for the child to exit correctly, when stdin has hit EOF
before the child has finished.
Thanks,
--
Adam Sampson <ats@offog.org> <http://offog.org/>
* Re: script from 2.25.1 may be broken (hangs)
2014-10-09 10:53 ` Adam Sampson
@ 2014-10-09 12:21 ` Karel Zak
2014-10-09 13:25 ` Fwd: " Thorsten Glaser
From: Karel Zak @ 2014-10-09 12:21 UTC
To: Adam Sampson; +Cc: Thorsten Glaser, util-linux
On Thu, Oct 09, 2014 at 11:53:15AM +0100, Adam Sampson wrote:
> Thorsten Glaser <tg@mirbsd.org> writes:
>
> > $ script -qc './test.sh -v' </dev/null 2>&1 | tee log
>
> For me, this doesn't hang, but it does exit before test.sh is actually
> finished. Here's a simpler example that does the same thing:
>
> $ cat simpler.sh
> #!/bin/sh
> echo one
> sleep 2
> echo two
> $ script -c './simpler.sh'
> Script started, file is typescript
> one
> two
> Script done, file is typescript
> $ script -c './simpler.sh' </dev/null
> Script started, file is typescript
> one
> Script done, file is typescript
>
> The "wait for children" code at the end of doinput looks suspicious to
> me -- finish() doesn't actually block, as the comment implies, just
Hmm... because of WNOHANG, it seems we need one function for the signal
handler (with WNOHANG) and another function for the real program
termination (without WNOHANG).
Karel
--
Karel Zak <kzak@redhat.com>
http://karelzak.blogspot.com
* Fwd: script from 2.25.1 may be broken (hangs)
2014-10-09 12:21 ` Karel Zak
@ 2014-10-09 13:25 ` Thorsten Glaser
2014-10-14 10:12 ` Karel Zak
From: Thorsten Glaser @ 2014-10-09 13:25 UTC
To: Karel Zak; +Cc: Adam Sampson, util-linux
[-- Attachment #1: Type: TEXT/PLAIN, Size: 613 bytes --]
From: Andreas Henriksson <andreas@fatal.se>
Message-ID: <20141009130557.GA20938@fatal.se>
To: 764547@bugs.debian.org
Date: Thu, 9 Oct 2014 15:05:57 +0200
Subject: Re: Bug#764547: Fwd: script from 2.25.1 may be broken (hangs)
Thanks for the improved testcase.
I spent two seconds looking, and finish() indeed looks like it should
not be using WNOHANG, at least when explicitly called, e.g. as in
the attached patch.
Would be great if someone was willing to take charge of looking
over this and getting a fix merged upstream. Poke me once merged
and a backport should be a trivial matter.
Regards,
Andreas Henriksson
[-- Attachment #2: Type: TEXT/PLAIN, Size: 1273 bytes --]
diff --git a/term-utils/script.c b/term-utils/script.c
index b9f8738..b12b7fd 100644
--- a/term-utils/script.c
+++ b/term-utils/script.c
@@ -80,6 +80,7 @@
 #define DEFAULT_OUTPUT "typescript"
+void sig_finish(int);
 void finish(int);
 void done(void);
 void fail(void);
@@ -258,7 +259,7 @@ main(int argc, char **argv) {
 	/* setup SIGCHLD handler */
 	sigemptyset(&sa.sa_mask);
 	sa.sa_flags = 0;
-	sa.sa_handler = finish;
+	sa.sa_handler = sig_finish;
 	sigaction(SIGCHLD, &sa, NULL);
 	/* init mask for SIGCHLD */
@@ -385,17 +386,18 @@ doinput(void) {
 	}
 	if (!die)
-		finish(0); /* wait for childern */
+		finish(1); /* wait for children */
 	done();
 }
 void
-finish(int dummy __attribute__ ((__unused__))) {
+finish(int wait) {
 	int status;
 	pid_t pid;
 	int errsv = errno;
+	int options = wait ? 0 : WNOHANG;
-	while ((pid = wait3(&status, WNOHANG, 0)) > 0)
+	while ((pid = wait3(&status, options, 0)) > 0)
 		if (pid == child) {
 			childstatus = status;
 			die = 1;
@@ -405,6 +407,11 @@ finish(int dummy __attribute__ ((__unused__))) {
 }
 void
+sig_finish(int dummy __attribute__ ((__unused__))) {
+	finish(0);
+}
+
+void
 resize(int dummy __attribute__ ((__unused__))) {
 	resized = 1;
 }
* Re: Fwd: script from 2.25.1 may be broken (hangs)
2014-10-09 13:25 ` Fwd: " Thorsten Glaser
@ 2014-10-14 10:12 ` Karel Zak
From: Karel Zak @ 2014-10-14 10:12 UTC
To: Thorsten Glaser; +Cc: Adam Sampson, util-linux
On Thu, Oct 09, 2014 at 01:25:12PM +0000, Thorsten Glaser wrote:
> diff --git a/term-utils/script.c b/term-utils/script.c
> index b9f8738..b12b7fd 100644
> --- a/term-utils/script.c
> +++ b/term-utils/script.c
Applied, thanks.
Please, don't use attachments for patches next time.
Karel
--
Karel Zak <kzak@redhat.com>
http://karelzak.blogspot.com