* Re: 2.6.25.3: su gets stuck for root @ 2008-06-02 1:31 Joe Peterson 2008-06-02 5:12 ` Harald Dunkel 0 siblings, 1 reply; 54+ messages in thread
From: Joe Peterson @ 2008-06-02 1:31 UTC (permalink / raw)
To: harald.dunkel, linux-kernel; +Cc: Alan Cox

Hi Harald,

I also just discovered this problem independently, and when I tracked it down to stty and googled for it, I found your post.

In my test case, it seems to get stuck in stty as run from the user's .bashrc (i.e., "su user", where the user's .bashrc has the stty command). In my case, the arguments to stty do not seem to matter (well, I've tried "-ixany" and "echoctl" - same results). Also, the problem reproduces more reliably if a sleep is done before the stty. E.g., here's my test .bashrc:

	sleep 2
	stty -ixany

Note that if run from the console or a tty, having the user logged in already seems to avoid the hang, but doing it within an xterm shows the hang. Strange, since with my original [more complex] test case, it seemed to require *not* running X (tty/console only).

Most recent kernels show the issue - the only one that doesn't is 2.6.25-git17. I am running Gentoo. It does happen in a recent 2.6.26 git (an rc4 git from a couple of days ago).

Doing "ps" while hung shows stty in the "T" state. "killall -9 stty" releases it.

-Joe

P.S. Please cc my address on reply.

^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 1:31 2.6.25.3: su gets stuck for root Joe Peterson @ 2008-06-02 5:12 ` Harald Dunkel 2008-06-02 5:32 ` Willy Tarreau 2008-06-02 5:42 ` 2.6.25.3: su gets stuck for root Joe Peterson 0 siblings, 2 replies; 54+ messages in thread From: Harald Dunkel @ 2008-06-02 5:12 UTC (permalink / raw) To: Joe Peterson; +Cc: linux-kernel, Alan Cox Hi Joe, Joe Peterson wrote: > Hi Harold, > > Doing "ps" while hung shows stty in the "T" state. "killall -9 stty" > releases it. > Does strace give you the same output if you attach it to the blocking stty (strace -p $pid)? I got : ioctl(0, SNDCTL_TMR_START or TCSETS, {B38400 opost isig icanon echo ...}) = ? ERESTARTSYS (To be restarted) --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- : Regards Harri ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 5:12 ` Harald Dunkel @ 2008-06-02 5:32 ` Willy Tarreau 2008-06-02 5:55 ` Joe Peterson 2008-06-02 8:10 ` Alan Cox 2008-06-02 5:42 ` 2.6.25.3: su gets stuck for root Joe Peterson 1 sibling, 2 replies; 54+ messages in thread From: Willy Tarreau @ 2008-06-02 5:32 UTC (permalink / raw) To: Harald Dunkel; +Cc: Joe Peterson, linux-kernel, Alan Cox On Mon, Jun 02, 2008 at 07:12:06AM +0200, Harald Dunkel wrote: > Hi Joe, > > Joe Peterson wrote: > >Hi Harold, > > > >Doing "ps" while hung shows stty in the "T" state. "killall -9 stty" > >releases it. > > > > Does strace give you the same output if you attach it to the blocking > stty (strace -p $pid)? > > I got > > > : > ioctl(0, SNDCTL_TMR_START or TCSETS, {B38400 opost isig icanon echo ...}) = > ? ERESTARTSYS (To be restarted) > --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- > --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- > : Guys, you should test if "kill -CONT $pid" wakes the process up. It might be possible that some obscure bug appeared in the tty code resulting in SIGTTOU sometimes being sent to the caller, although that seems rather strange :-/ Willy ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 5:32 ` Willy Tarreau @ 2008-06-02 5:55 ` Joe Peterson 2008-06-02 8:10 ` Alan Cox 1 sibling, 0 replies; 54+ messages in thread
From: Joe Peterson @ 2008-06-02 5:55 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Harald Dunkel, linux-kernel, Alan Cox

Willy Tarreau wrote:
> Guys, you should test if "kill -CONT $pid" wakes the process up.
> It might be possible that some obscure bug appeared in the tty
> code resulting in SIGTTOU sometimes being sent to the caller,
> although that seems rather strange :-/

Just tried this ("kill -CONT <pid>") - no luck.

BTW, it should be possible, I would think, for others to duplicate this fairly easily. Just:

1) make a user, "foo", with login shell set to /bin/bash
2) create a .bashrc in foo's home dir with contents:

	sleep 2
	stty -ixany

3) cp .bashrc .bash_profile (only needed to test "su - foo" too)
4) become root
5) type "su foo" (or "su - foo")

Sometimes it takes a second try to get it to happen. If the su hangs, check to see if the stty process is in state "T". Also, it may make a difference if you are logged in already as foo or are using X. I first noticed this with no users logged in (except root) and no X running (but I can reproduce with X/xterm as well using this simple test case). It seems timing is a factor, so it's worth trying various things.

-Joe

^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 5:32 ` Willy Tarreau 2008-06-02 5:55 ` Joe Peterson @ 2008-06-02 8:10 ` Alan Cox 2008-06-02 9:01 ` David Newall 1 sibling, 1 reply; 54+ messages in thread From: Alan Cox @ 2008-06-02 8:10 UTC (permalink / raw) To: Willy Tarreau; +Cc: Harald Dunkel, Joe Peterson, linux-kernel, Alan Cox > Guys, you should test if "kill -CONT $pid" wakes the process up. > It might be possible that some obscure bug appeared in the tty > code resulting in SIGTTOU sometimes being sent to the caller, > although that seems rather strange :-/ Not really. The task would get suspended if it attempted to change the tty settings while not being session leader. This is part of the POSIX and BSD job control. A race (either kernel or in something like sshd/bash) would do that and could have been caused by any of the timing changes recently. That would also explain why I can't duplicate it, and the sleep observation. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 8:10 ` Alan Cox @ 2008-06-02 9:01 ` David Newall 2008-06-02 9:20 ` Alan Cox 0 siblings, 1 reply; 54+ messages in thread From: David Newall @ 2008-06-02 9:01 UTC (permalink / raw) To: Alan Cox; +Cc: Willy Tarreau, Harald Dunkel, Joe Peterson, linux-kernel, Alan Cox Alan Cox wrote: > Not really. The task would get suspended if it attempted to change the > tty settings while not being session leader. This is part of the POSIX > and BSD job control. I haven't heard about this new restriction, but it begs the observation that stty, when forked from a shell (the usual case), is never a session leader. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 9:01 ` David Newall @ 2008-06-02 9:20 ` Alan Cox 2008-06-02 10:16 ` Vegard Nossum 2008-06-02 15:26 ` Joe Peterson 0 siblings, 2 replies; 54+ messages in thread From: Alan Cox @ 2008-06-02 9:20 UTC (permalink / raw) To: David Newall Cc: Willy Tarreau, Harald Dunkel, Joe Peterson, linux-kernel, Alan Cox On Mon, 02 Jun 2008 18:31:34 +0930 David Newall <davidn@davidnewall.com> wrote: > Alan Cox wrote: > > Not really. The task would get suspended if it attempted to change the > > tty settings while not being session leader. This is part of the POSIX > > and BSD job control. > > I haven't heard about this new restriction, but it begs the observation > that stty, when forked from a shell (the usual case), is never a session > leader. Sorry I mean part of the current session. I was thinking about the specific case of bash or the ssh->bash setup where the question would be whether the shell was session leader. Someone who can dup this needs to instrument it in tty_ioctl really. Alan ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 9:20 ` Alan Cox @ 2008-06-02 10:16 ` Vegard Nossum 2008-06-02 10:39 ` Vegard Nossum 2008-06-02 10:50 ` Alan Cox 2008-06-02 15:26 ` Joe Peterson 1 sibling, 2 replies; 54+ messages in thread From: Vegard Nossum @ 2008-06-02 10:16 UTC (permalink / raw) To: Alan Cox Cc: David Newall, Willy Tarreau, Harald Dunkel, Joe Peterson, linux-kernel, Alan Cox [-- Attachment #1: Type: text/plain, Size: 1825 bytes --] On Mon, Jun 2, 2008 at 11:20 AM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: > On Mon, 02 Jun 2008 18:31:34 +0930 > David Newall <davidn@davidnewall.com> wrote: > >> Alan Cox wrote: >> > Not really. The task would get suspended if it attempted to change the >> > tty settings while not being session leader. This is part of the POSIX >> > and BSD job control. >> >> I haven't heard about this new restriction, but it begs the observation >> that stty, when forked from a shell (the usual case), is never a session >> leader. > > Sorry I mean part of the current session. I was thinking about the > specific case of bash or the ssh->bash setup where the question would be > whether the shell was session leader. > > Someone who can dup this needs to instrument it in tty_ioctl really. Hi, I have written a short test program that seems to reproduce it for me (see attachment), even though the original su/stty stuff wouldn't. Basically, the strace shows this: ioctl(0, SNDCTL_TMR_START or TCSETS, {B38400 opost isig icanon echo ...}) = ? ERESTARTSYS (To be restarted) --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- ioctl(0, SNDCTL_TMR_START or TCSETS, {B38400 opost isig icanon echo ...}) = ? ERESTARTSYS (To be restarted) --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- ... (repeating) The exact code path triggering this seems to be: tcsetattr() -> ioctl(TCSETS) -> set_termios() -> tty_check_change() This is on a 2.6.24.5-85.fc8 kernel. 
I don't know what's wrong, but I hope this helps.

Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: reproduce.c --]
[-- Type: text/x-csrc; name=reproduce.c, Size: 1077 bytes --]

#include <sys/types.h>
#include <sys/wait.h>

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <termios.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
	pid_t child;

	printf("pgid = %d\n", getpgrp());

	child = fork();
	if (child == 0) {
		struct termios termios_p;

		printf("forked, pgid = %d\n", getpgrp());

		/* Move the child into a process group of its own. */
		if (setpgrp() == -1) {
			printf("error: setpgrp: %s\n", strerror(errno));
			exit(EXIT_FAILURE);
		}

		printf("new pgid = %d\n", getpgrp());

		if (tcgetattr(STDIN_FILENO, &termios_p) == -1) {
			printf("error: tcgetattr: %s\n", strerror(errno));
			exit(EXIT_FAILURE);
		}

		/* Write the unchanged settings back; this is the call that stops. */
		if (tcsetattr(STDIN_FILENO, TCSANOW, &termios_p) == -1) {
			printf("error: tcsetattr: %s\n", strerror(errno));
			exit(EXIT_FAILURE);
		}

		exit(EXIT_SUCCESS);
	}

	printf("forked, child = %d\n", child);

	/* Parent: report every status change of the child. */
	while (1) {
		pid_t pid;
		int status;

		pid = wait(&status);
		if (pid == -1) {
			printf("error: wait: %s\n", strerror(errno));
			exit(EXIT_FAILURE);
		}

		printf("pid %d status %d\n", pid, status);
	}

	return EXIT_SUCCESS;
}

^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 10:16 ` Vegard Nossum @ 2008-06-02 10:39 ` Vegard Nossum 2008-06-02 10:52 ` Alan Cox 2008-06-02 10:50 ` Alan Cox 1 sibling, 1 reply; 54+ messages in thread From: Vegard Nossum @ 2008-06-02 10:39 UTC (permalink / raw) To: Alan Cox Cc: David Newall, Willy Tarreau, Harald Dunkel, Joe Peterson, linux-kernel, Alan Cox On Mon, Jun 2, 2008 at 12:16 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote: > On Mon, Jun 2, 2008 at 11:20 AM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: >> On Mon, 02 Jun 2008 18:31:34 +0930 >> David Newall <davidn@davidnewall.com> wrote: >> >>> Alan Cox wrote: >>> > Not really. The task would get suspended if it attempted to change the >>> > tty settings while not being session leader. This is part of the POSIX >>> > and BSD job control. >>> >>> I haven't heard about this new restriction, but it begs the observation >>> that stty, when forked from a shell (the usual case), is never a session >>> leader. >> >> Sorry I mean part of the current session. I was thinking about the >> specific case of bash or the ssh->bash setup where the question would be >> whether the shell was session leader. >> >> Someone who can dup this needs to instrument it in tty_ioctl really. > > Hi, > > I have written a short test program that seems to reproduce it for me > (see attachment), even though the original su/stty stuff wouldn't. > > Basically, the strace shows this: > ioctl(0, SNDCTL_TMR_START or TCSETS, {B38400 opost isig icanon echo > ...}) = ? ERESTARTSYS (To be restarted) > --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- > --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- > ioctl(0, SNDCTL_TMR_START or TCSETS, {B38400 opost isig icanon echo > ...}) = ? ERESTARTSYS (To be restarted) > --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- > --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- > ... 
(repeating)
>
> The exact code path triggering this seems to be:
>
> tcsetattr() -> ioctl(TCSETS) -> set_termios() -> tty_check_change()
>
> This is on a 2.6.24.5-85.fc8 kernel.
>
> I don't know what's wrong, but I hope this helps.

The error seems to be that tty_check_change() returns -ERESTARTSYS. Shouldn't it be EINTR to allow the signal to be processed and let the process decide whether to retry the tcsetattr()?

Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036

^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 10:39 ` Vegard Nossum @ 2008-06-02 10:52 ` Alan Cox 2008-06-02 10:57 ` Vegard Nossum 0 siblings, 1 reply; 54+ messages in thread From: Alan Cox @ 2008-06-02 10:52 UTC (permalink / raw) To: Vegard Nossum Cc: Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, Joe Peterson, linux-kernel, Alan Cox On Mon, Jun 02, 2008 at 12:39:29PM +0200, Vegard Nossum wrote: > Shouldn't it be EINTR to allow the signal to be processed and let the > process decide whether to retry the tcsetattr()? The signal is processed, and then application retries the tcsetattr and gets another one. The default TTOU behaviour is to block and then fg continues the call so RESTARTSYS is both correct and has been used for years ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 10:52 ` Alan Cox @ 2008-06-02 10:57 ` Vegard Nossum 2008-06-02 12:28 ` Alan Cox 0 siblings, 1 reply; 54+ messages in thread From: Vegard Nossum @ 2008-06-02 10:57 UTC (permalink / raw) To: Alan Cox Cc: Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, Joe Peterson, linux-kernel On Mon, Jun 2, 2008 at 12:52 PM, Alan Cox <alan@redhat.com> wrote: > On Mon, Jun 02, 2008 at 12:39:29PM +0200, Vegard Nossum wrote: >> Shouldn't it be EINTR to allow the signal to be processed and let the >> process decide whether to retry the tcsetattr()? > > The signal is processed, and then application retries the tcsetattr and > gets another one. The default TTOU behaviour is to block and then fg > continues the call so RESTARTSYS is both correct and has been used for > years > Hm, yes, that seems correct. I'm sorry for the wrong suggestions. I guess this still doesn't explain why TTOU doesn't block (IOW, stop the process, right?) in this case, because my test program does not touch it. Vegard -- "The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation." -- E. W. Dijkstra, EWD1036 ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 10:57 ` Vegard Nossum @ 2008-06-02 12:28 ` Alan Cox 2008-06-02 14:31 ` Vegard Nossum 0 siblings, 1 reply; 54+ messages in thread From: Alan Cox @ 2008-06-02 12:28 UTC (permalink / raw) To: Vegard Nossum Cc: Alan Cox, Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, Joe Peterson, linux-kernel On Mon, Jun 02, 2008 at 12:57:07PM +0200, Vegard Nossum wrote: > I guess this still doesn't explain why TTOU doesn't block (IOW, stop > the process, right?) in this case, because my test program does not > touch it. I see the parent process sleeping and the child taking TTOU and going to state T. That again is correct. alan 3219 0.0 0.0 3652 384 pts/5 S 13:11 0:00 ./repro alan 3220 0.0 0.0 3652 204 pts/5 T 13:11 0:00 ./repro If you run it without any straces etc do you see it blocked in T or sitting in R ? Alan ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 12:28 ` Alan Cox @ 2008-06-02 14:31 ` Vegard Nossum 0 siblings, 0 replies; 54+ messages in thread From: Vegard Nossum @ 2008-06-02 14:31 UTC (permalink / raw) To: Alan Cox Cc: Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, Joe Peterson, linux-kernel On Mon, Jun 2, 2008 at 2:28 PM, Alan Cox <alan@redhat.com> wrote: > On Mon, Jun 02, 2008 at 12:57:07PM +0200, Vegard Nossum wrote: >> I guess this still doesn't explain why TTOU doesn't block (IOW, stop >> the process, right?) in this case, because my test program does not >> touch it. > > I see the parent process sleeping and the child taking TTOU and going to > state T. That again is correct. > > alan 3219 0.0 0.0 3652 384 pts/5 S 13:11 0:00 ./repro > alan 3220 0.0 0.0 3652 204 pts/5 T 13:11 0:00 ./repro > > If you run it without any straces etc do you see it blocked in T or sitting > in R ? Without any straces, it is blocked in T. Like Joe's report. With strace, it's in R. Exactly as you said, correct and expected behaviour. So this is not a kernel problem at all. I'm sorry for having wasted your time :-( Vegard -- "The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation." -- E. W. Dijkstra, EWD1036 ^ permalink raw reply [flat|nested] 54+ messages in thread
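The "T or R" question above can also be answered without ps by reading the state letter from /proc/<pid>/stat on Linux. The following is only a minimal sketch, not something posted in the thread; the helper names state_from_stat_line() and proc_state() are made up for illustration. It relies on the state letter being the first field after the parenthesised command name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: extract the state letter (e.g. 'R', 'S', 'T')
 * from one line in /proc/<pid>/stat format. Returns '\0' if the line
 * does not look like a stat line.
 */
char state_from_stat_line(const char *line)
{
	const char *p;

	assert(line != NULL);

	/* The command name may itself contain spaces or ')', so search
	 * for the LAST closing parenthesis. */
	p = strrchr(line, ')');
	if (p == NULL || p[1] != ' ' || p[2] == '\0')
		return '\0';
	return p[2];
}

/*
 * Hypothetical helper: read the state letter of a process from /proc
 * (Linux-specific). Returns '\0' if the file cannot be read.
 */
char proc_state(pid_t pid)
{
	char path[64];
	char buf[512];
	char state = '\0';
	FILE *f;

	snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
	f = fopen(path, "r");
	if (f == NULL)
		return '\0';
	if (fgets(buf, sizeof buf, f) != NULL)
		state = state_from_stat_line(buf);
	fclose(f);
	return state;
}
```

Calling proc_state() with the pid of the stopped child would be expected to give 'T', matching the ps listing quoted above.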
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 10:16 ` Vegard Nossum 2008-06-02 10:39 ` Vegard Nossum @ 2008-06-02 10:50 ` Alan Cox 2008-06-17 15:32 ` Joe Peterson 1 sibling, 1 reply; 54+ messages in thread
From: Alan Cox @ 2008-06-02 10:50 UTC (permalink / raw)
To: Vegard Nossum Cc: Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, Joe Peterson, linux-kernel, Alan Cox

On Mon, Jun 02, 2008 at 12:16:56PM +0200, Vegard Nossum wrote:
> ioctl(0, SNDCTL_TMR_START or TCSETS, {B38400 opost isig icanon echo
> ...}) = ? ERESTARTSYS (To be restarted)
> --- SIGTTOU (Stopped (tty output)) @ 0 (0) ---
> --- SIGTTOU (Stopped (tty output)) @ 0 (0) ---
> ioctl(0, SNDCTL_TMR_START or TCSETS, {B38400 opost isig icanon echo
> ...}) = ? ERESTARTSYS (To be restarted)
> --- SIGTTOU (Stopped (tty output)) @ 0 (0) ---
> --- SIGTTOU (Stopped (tty output)) @ 0 (0) ---
> ... (repeating)
>
> The exact code path triggering this seems to be:
>
> tcsetattr() -> ioctl(TCSETS) -> set_termios() -> tty_check_change()

This looks correct to me and in fact I see the behaviour you report on 2.6.23 when running it. If I tell it to ignore SIGTTOU that also then behaves as expected.

If
	your pgrp is not the pgrp of the tty
	and you are not ignoring TTOU
	and you are not orphaned (as a group)

Then we are *supposed* to send you SIGTTOU and kick you back into touch.

This is so that if you do

	someapp
	^Z
	bg
	otherapp

And someapp wants to change the tty settings it blocks back to the shell.

This is correct behaviour and behaviour we've had for years.

Alan

^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 10:50 ` Alan Cox @ 2008-06-17 15:32 ` Joe Peterson 0 siblings, 0 replies; 54+ messages in thread
From: Joe Peterson @ 2008-06-17 15:32 UTC (permalink / raw)
To: Alan Cox Cc: Vegard Nossum, Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel

Alan Cox wrote:
> This looks correct to me and in fact I see the behaviour you report on 2.6.23
> when running it. If I tell it to ignore SIGTTOU that also then behaves as
> expected.
>
> If
> your pgrp is not the pgrp of the tty
> and you are not ignoring TTOU
> and you are not orphaned (as a group)
>
> Then we are *supposed* to send you SIGTTOU and kick you back
> into touch.

OK, I am still baffled. I've thought of several different theories, wondering if bash does not have the right parent process, how there could be a race in the kernel or elsewhere, but as far as I can tell, things are in order. Here's the ps -ax --forest output while hung:

 6435 tty3  Ss   0:00 /bin/login --
 7954 tty3  S    0:00  \_ -bash
 7958 tty3  S+   0:00      \_ su foo
 7959 tty3  S    0:00          \_ bash
 7964 tty3  T    0:00              \_ stty -ixany

I had logged into the tty as root (with shell set to bash), then su'd to foo (with shell set to bash), so this tree makes sense. During the sleep before the stty, sleep is under the final bash similar to the way stty is while it is hung.

Note that the stty is a child of bash (which, BTW, sometimes appears as "-su" instead - I am not clear on that), and they all lead back to the original tty, which I gather is the session leader (or is it the "su"?).

Now, the debugging I did shows that the reason that tty_check_change() returns an error is that tty->pgrp != task_pgrp(current). The former is the "su foo" process, and the latter is the bash child process. So I guess that when it does work, they are the same process, but why would they be the same (or not, as it were)? Does something happen during bash startup that causes bash to become the session leader?
Please, please, someone who understands the mechanics better than I let me know how I can explore this more deeply. Thanks, Joe ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 9:20 ` Alan Cox 2008-06-02 10:16 ` Vegard Nossum @ 2008-06-02 15:26 ` Joe Peterson 2008-06-02 15:51 ` Alan Cox 1 sibling, 1 reply; 54+ messages in thread From: Joe Peterson @ 2008-06-02 15:26 UTC (permalink / raw) To: Alan Cox; +Cc: David Newall, Willy Tarreau, Harald Dunkel, linux-kernel, Alan Cox Alan Cox wrote: > Someone who can dup this needs to instrument it in tty_ioctl really. Alan, since I can get it to happen faithfully, I can try this - any suggestions on where to instrument? Thanks, Joe P.S. My stty process sits in "T" - did you say that it would be in "R" if straced and that is correct? ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 15:26 ` Joe Peterson @ 2008-06-02 15:51 ` Alan Cox 2008-06-02 16:03 ` Joe Peterson 2008-06-04 14:43 ` Joe Peterson 0 siblings, 2 replies; 54+ messages in thread From: Alan Cox @ 2008-06-02 15:51 UTC (permalink / raw) To: Joe Peterson Cc: Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel, Alan Cox On Mon, Jun 02, 2008 at 09:26:48AM -0600, Joe Peterson wrote: > P.S. My stty process sits in "T" - did you say that it would be in "R" > if straced and that is correct? T would be correct. I'll put together a small diff to printk useful stuff when it happens and sent it you tonight/tomorrow -- -- Take control of enterprise infrastructure Sign up for starfleet academy today ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 15:51 ` Alan Cox @ 2008-06-02 16:03 ` Joe Peterson 2008-06-04 14:43 ` Joe Peterson 1 sibling, 0 replies; 54+ messages in thread From: Joe Peterson @ 2008-06-02 16:03 UTC (permalink / raw) To: Alan Cox; +Cc: Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel Alan Cox wrote: > On Mon, Jun 02, 2008 at 09:26:48AM -0600, Joe Peterson wrote: >> P.S. My stty process sits in "T" - did you say that it would be in "R" >> if straced and that is correct? > > T would be correct. I'll put together a small diff to printk useful stuff > when it happens and sent it you tonight/tomorrow Awesome; that would be great - thanks! -Joe ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 15:51 ` Alan Cox 2008-06-02 16:03 ` Joe Peterson @ 2008-06-04 14:43 ` Joe Peterson 2008-06-04 15:16 ` Alan Cox 1 sibling, 1 reply; 54+ messages in thread
From: Joe Peterson @ 2008-06-04 14:43 UTC (permalink / raw)
To: Alan Cox; +Cc: Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel

Alan Cox wrote:
> On Mon, Jun 02, 2008 at 09:26:48AM -0600, Joe Peterson wrote:
>> P.S. My stty process sits in "T" - did you say that it would be in "R"
>> if straced and that is correct?
>
> T would be correct. I'll put together a small diff to printk useful stuff
> when it happens and sent it you tonight/tomorrow

[Alan, thanks for the tips on where to instrument this]

What I have verified so far is that when the problem occurs, it gets to this point in [tty_io.c] tty_check_change():

	kill_pgrp(task_pgrp(current), SIGTTOU, 1);
	set_thread_flag(TIF_SIGPENDING);
	ret = -ERESTARTSYS;
out:
	return ret;

So the error that gets returned to set_termios() is -512.

Also, the various checks before this point (of course) did not pass (current->signal->tty != tty, !tty->pgrp, task_pgrp(current) == tty->pgrp, is_ignored(SIGTTOU), is_current_pgrp_orphaned()). I have not printed out the various values from these - let me know if this would be helpful. I wanted to pass this info along now in case it is of help.

-Joe

^ permalink raw reply [flat|nested] 54+ messages in thread
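The order of the checks listed above can be modelled in plain userspace C, which makes it easier to see which combination of inputs ends at the kill_pgrp() call. This is only a sketch of the decision logic, not the kernel function: all names below (check_input, check_result, model_tty_check_change) are invented for illustration, and the assumption that the orphaned case fails with -EIO comes from a reading of kernel sources of that era rather than from anything stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Possible outcomes of the modelled check (illustrative names). */
enum check_result {
	CHECK_PROCEED,		/* caller may go ahead, i.e. return 0 */
	CHECK_FAIL_EIO,		/* ASSUMPTION: orphaned group gets -EIO */
	CHECK_STOP_SIGTTOU	/* SIGTTOU sent, -ERESTARTSYS returned */
};

/* One field per test named in the message above. */
struct check_input {
	bool is_controlling_tty;	/* current->signal->tty == tty */
	bool tty_has_pgrp;		/* tty->pgrp != NULL */
	int tty_pgrp;			/* pgrp number held by the tty */
	int task_pgrp;			/* pgrp number of the caller */
	bool sigttou_ignored;		/* is_ignored(SIGTTOU) */
	bool pgrp_orphaned;		/* is_current_pgrp_orphaned() */
};

/* Userspace model of the check order; not the kernel implementation. */
enum check_result model_tty_check_change(const struct check_input *in)
{
	assert(in != NULL);

	if (!in->is_controlling_tty)
		return CHECK_PROCEED;
	if (!in->tty_has_pgrp)
		return CHECK_PROCEED;
	if (in->task_pgrp == in->tty_pgrp)
		return CHECK_PROCEED;
	if (in->sigttou_ignored)
		return CHECK_PROCEED;
	if (in->pgrp_orphaned)
		return CHECK_FAIL_EIO;
	return CHECK_STOP_SIGTTOU;
}
```

In this model, reaching the SIGTTOU branch requires that the caller is on its controlling tty, sits in a different process group from the one the tty holds, is not ignoring SIGTTOU, and is not orphaned.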
* Re: 2.6.25.3: su gets stuck for root 2008-06-04 14:43 ` Joe Peterson @ 2008-06-04 15:16 ` Alan Cox 2008-06-04 16:52 ` Joe Peterson 0 siblings, 1 reply; 54+ messages in thread From: Alan Cox @ 2008-06-04 15:16 UTC (permalink / raw) To: Joe Peterson Cc: Alan Cox, Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel On Wed, Jun 04, 2008 at 08:43:00AM -0600, Joe Peterson wrote: > > So the error that gets returned to set_termios() is -512. > > Also, the various checks before this point (of course) did not pass > (current->signal->tty != tty, !tty->pgrp, task_pgrp(current) == > tty->pgrp, is_ignored(SIGTTOU), is_current_pgrp_orphaned()). I have not > printed out the various values from these - let me know if this would be > helpful. I wanted to pass this info along now in case it is of help. See what tty->pgrp is at that point when it hangs and that might identify who is owning the tty and tty setup ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-04 15:16 ` Alan Cox @ 2008-06-04 16:52 ` Joe Peterson 2008-06-04 17:10 ` Alan Cox 0 siblings, 1 reply; 54+ messages in thread
From: Joe Peterson @ 2008-06-04 16:52 UTC (permalink / raw)
To: Alan Cox; +Cc: Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel

Alan Cox wrote:
> See what tty->pgrp is at that point when it hangs and that might identify
> who is owning the tty and tty setup

tty = current->signal->tty = -142080000 or 0xf7880800
task->pgrp = -142405824 or 0xf7830f40

-Joe

^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-04 16:52 ` Joe Peterson @ 2008-06-04 17:10 ` Alan Cox 2008-06-04 20:32 ` Joe Peterson 0 siblings, 1 reply; 54+ messages in thread
From: Alan Cox @ 2008-06-04 17:10 UTC (permalink / raw)
To: Joe Peterson Cc: Alan Cox, Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel

> tty = current->signal->tty = -142080000 or 0xf7880800
> task->pgrp = -142405824 or 0xf7830f40

task->pgrp is a struct pid - you need the value it holds

^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-04 17:10 ` Alan Cox @ 2008-06-04 20:32 ` Joe Peterson 2008-06-11 14:04 ` Joe Peterson 0 siblings, 1 reply; 54+ messages in thread
From: Joe Peterson @ 2008-06-04 20:32 UTC (permalink / raw)
To: Alan Cox; +Cc: Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel

Alan Cox wrote:
>> tty = current->signal->tty = -142080000 or 0xf7880800
>> task->pgrp = -142405824 or 0xf7830f40
>
> task->pgrp is a struct pid - you need the value it holds

Yeah, I figured later that giving you the addresses was rather useless. :)

Anyway, here is more info:

tty_check_change: current->signal->tty = f7880800
tty_check_change: tty = f7880800
tty_check_change: tty->pgrp = f7b99e40
	tty->pgrp->count = 5
	tty->pgrp->level = 0
	tty->pgrp->numbers[0].nr = 6951
tty_check_change: task_pgrp(current) = f7b99d40
	task_pgrp(current)->count = 1
	task_pgrp(current)->level = 0
	task_pgrp(current)->numbers[0].nr = 6952
tty_check_change: kill_pgrp called; returning -ERESTARTSYS
set_termios: error return value (-512) from tty_check_change

foo 6951 0.0 0.1 2332 1096 tty1 S+ 14:18 0:00 su foo
foo 6952 0.0 0.1 2988 1464 tty1 S  14:18 0:00 bash

So, looks like the tty->pgrp's process is the "su" command itself, and the task_pgrp(current)'s process is "bash" - the shell started by the su.

-Joe

^ permalink raw reply [flat|nested] 54+ messages in thread
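The same comparison the kernel output above shows (the tty's process group versus the caller's) can be made from userspace with tcgetpgrp() and getpgrp(). The sketch below is illustrative only and was not posted in the thread; the helper name in_foreground_pgrp() is hypothetical. A program could call it before tcsetattr() to find out whether it would be treated as a background job.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: report whether the calling process belongs to
 * the foreground process group of the terminal open on fd.
 *
 * Returns 1 if it does, 0 if it is in a background process group, and
 * -1 if tcgetpgrp() fails (for example because fd is not the caller's
 * controlling terminal), with errno set by tcgetpgrp().
 */
int in_foreground_pgrp(int fd)
{
	pid_t fg;

	assert(fd >= 0);

	fg = tcgetpgrp(fd);
	if (fg == (pid_t)-1)
		return -1;
	return fg == getpgrp() ? 1 : 0;
}
```

With the numbers reported above, the terminal holds process group 6951 while the caller is in group 6952, so a helper like this would be expected to return 0 in that situation.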
* Re: 2.6.25.3: su gets stuck for root 2008-06-04 20:32 ` Joe Peterson @ 2008-06-11 14:04 ` Joe Peterson 2008-06-12 11:52 ` Vegard Nossum 0 siblings, 1 reply; 54+ messages in thread From: Joe Peterson @ 2008-06-11 14:04 UTC (permalink / raw) To: Alan Cox; +Cc: Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel Joe Peterson wrote: > Anyway, here is more info: > > tty_check_change: current->signal->tty = f7880800 > tty_check_change: tty = f7880800 > tty_check_change: tty->pgrp = f7b99e40 > tty->pgrp->count = 5 > tty->pgrp->level = 0 > tty->pgrp->numbers[0].nr = 6951 > tty_check_change: task_pgrp(current) = f7b99d40 > task_pgrp(current)->count = 1 > task_pgrp(current)->level = 0 > task_pgrp(current)->numbers[0].nr = 6952 > tty_check_change: kill_pgrp called; returning -ERESTARTSYS > set_termios: error return value (-512) from tty_check_change > foo 6951 0.0 0.1 2332 1096 tty1 S+ 14:18 0:00 su foo > foo 6952 0.0 0.1 2988 1464 tty1 S 14:18 0:00 bash > > > So, looks like the tty->pgrp's process is the "su" command itself, and > the task_pgrp(current)'s process is "bash" - the shell started by the su. If anyone has any tips for my further debugging of this, given the above, let me know. I'd like to help resolve this. Thanks! Joe ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-11 14:04 ` Joe Peterson @ 2008-06-12 11:52 ` Vegard Nossum 2008-06-14 1:49 ` Joe Peterson 0 siblings, 1 reply; 54+ messages in thread From: Vegard Nossum @ 2008-06-12 11:52 UTC (permalink / raw) To: Joe Peterson Cc: Alan Cox, Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel On Wed, Jun 11, 2008 at 4:04 PM, Joe Peterson <joe@skyrush.com> wrote: > Joe Peterson wrote: >> Anyway, here is more info: >> >> tty_check_change: current->signal->tty = f7880800 >> tty_check_change: tty = f7880800 >> tty_check_change: tty->pgrp = f7b99e40 >> tty->pgrp->count = 5 >> tty->pgrp->level = 0 >> tty->pgrp->numbers[0].nr = 6951 >> tty_check_change: task_pgrp(current) = f7b99d40 >> task_pgrp(current)->count = 1 >> task_pgrp(current)->level = 0 >> task_pgrp(current)->numbers[0].nr = 6952 >> tty_check_change: kill_pgrp called; returning -ERESTARTSYS >> set_termios: error return value (-512) from tty_check_change >> foo 6951 0.0 0.1 2332 1096 tty1 S+ 14:18 0:00 su foo >> foo 6952 0.0 0.1 2988 1464 tty1 S 14:18 0:00 bash >> >> >> So, looks like the tty->pgrp's process is the "su" command itself, and >> the task_pgrp(current)'s process is "bash" - the shell started by the su. > > If anyone has any tips for my further debugging of this, given the > above, let me know. I'd like to help resolve this. I think knowing the pgrps of the above processes (there is possibly one more involved, stty?) would be useful; try: $ ps -eo pid,pgrp,tpgid,user,args ..as this problem occurs because a process tries to change the terminal settings (and subsequently gets suspended because of that) while it's not the owner of the terminal. This can happen if you fork something off to the background, e.g. like $ stty 9600 & (which should immediately give you [1]+ Stopped stty 9600), so can you please look for anything like that in your login scripts or shell rc files? I don't know any other way to debug this further, sorry :-( Thanks. 
Vegard -- "The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation." -- E. W. Dijkstra, EWD1036 ^ permalink raw reply [flat|nested] 54+ messages in thread
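The condition Vegard describes (a process touching the terminal while it is not in the terminal's foreground process group) can also be checked from inside a process. The following is a minimal Python sketch using the standard os module; the helper name foreground_status is made up for illustration, and it only reports the state, it does not change anything.

```python
import os


def foreground_status(fd=0):
    """Compare this process's group with the terminal's foreground group.

    Returns None when fd is not a terminal (or not our controlling terminal).
    """
    if not os.isatty(fd):
        return None
    try:
        tpgid = os.tcgetpgrp(fd)   # same value ps shows in the tpgid column
    except OSError:
        return None
    pgrp = os.getpgrp()            # same value ps shows in the pgrp column
    return {"pgrp": pgrp, "tpgid": tpgid, "foreground": pgrp == tpgid}


print(foreground_status(0))
```

Run from a shell prompt, "foreground" would be expected to be True; run with a trailing "&" it would be expected to be False, which is the state in which a terminal-settings change gets the process stopped.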
* Re: 2.6.25.3: su gets stuck for root 2008-06-12 11:52 ` Vegard Nossum @ 2008-06-14 1:49 ` Joe Peterson 2008-06-14 7:45 ` Vegard Nossum 0 siblings, 1 reply; 54+ messages in thread From: Joe Peterson @ 2008-06-14 1:49 UTC (permalink / raw) To: Vegard Nossum Cc: Alan Cox, Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel

Vegard Nossum wrote:
> I think knowing the pgrps of the above processes (there is possibly
> one more involved, stty?) would be useful; try:
>
> $ ps -eo pid,pgrp,tpgid,user,args

OK, I performed this test again (getting the su to hang), and here is the info:

tty_check_change: current->signal->tty = f7879800
tty_check_change: tty = f7879800
tty_check_change: tty->pgrp = f78639c0
tty->pgrp->count = 5
tty->pgrp->level = 0
tty->pgrp->numbers[0].nr = 7036
tty_check_change: task_pgrp(current) = f7863f00
task_pgrp(current)->count = 1
task_pgrp(current)->level = 0
task_pgrp(current)->numbers[0].nr = 7037
tty_check_change: kill_pgrp called; returning -ERESTARTSYS
set_termios: error return value (-512) from tty_check_change

scorpius ~ # ps aux | grep 7036
foo 7036 0.0 0.1 2336 1100 tty1 S+ 19:30 0:00 su foo
scorpius ~ # ps aux | grep 7037
foo 7037 0.0 0.1 2988 1460 tty1 S 19:30 0:00 bash
scorpius ~ # ps -eo pid,pgrp,tpgid,user,args | grep 7036
6902 6902 7036 root /bin/login --
6922 6922 7036 root -bash
7036 7036 7036 foo su foo
7037 7037 7036 foo bash
7042 7037 7036 foo stty -ixany
scorpius ~ # ps -eo pid,pgrp,tpgid,user,args | grep 7037
7037 7037 7036 foo bash
7042 7037 7036 foo stty -ixany
scorpius ~ # ps aux | grep 7042
foo 7042 0.0 0.0 1608 376 tty1 T 19:30 0:00 stty -ixany
scorpius ~ # ps -eo pid,pgrp,tpgid,user,args | grep 7042
7042 7037 7036 foo stty -ixany

(I omitted, of course, when grep found itself, and I compressed some white space to allow lines to fit nicely in the email)

> ..as this problem occurs because a process tries to change the
> terminal settings (and subsequently gets suspended because of that)
> while it's not the owner of the terminal.
>
> This can happen if you fork something off to the background, e.g. like
>
> $ stty 9600 &
>
> (which should immediately give you [1]+ Stopped stty 9600),
>
> so can you please look for anything like that in your login scripts or
> shell rc files?

I do use stty in my .bashrc (that's why this happens), but I do not put it in the background. Anyway, hope the additional info above is of help...

Thanks, Joe

^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-14 1:49 ` Joe Peterson @ 2008-06-14 7:45 ` Vegard Nossum 2008-06-14 17:43 ` Joe Peterson 0 siblings, 1 reply; 54+ messages in thread From: Vegard Nossum @ 2008-06-14 7:45 UTC (permalink / raw) To: Joe Peterson Cc: Alan Cox, Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel On Sat, Jun 14, 2008 at 3:49 AM, Joe Peterson <joe@skyrush.com> wrote: > Vegard Nossum wrote: >> I think knowing the pgrps of the above processes (there is possibly >> one more involved, stty?) would be useful; try: >> >> $ ps -eo pid,pgrp,tpgid,user,args > > OK, I performed this test again (getting the su to hang), and here is > the info: [snip] > scorpius ~ # ps -eo pid,pgrp,tpgid,user,args | grep 7036 > 6902 6902 7036 root /bin/login -- > 6922 6922 7036 root -bash > 7036 7036 7036 foo su foo > 7037 7037 7036 foo bash > 7042 7037 7036 foo stty -ixany So this clearly shows what's wrong; 7036 is the "controlling process" group id. But only "su foo" is in this group, the bash and stty processes have their own group, 7037. On my own system, when I do "su", I get this: 2891 2891 2892 root su temp 2892 2892 2892 temp bash ...and here the "bash" process is in the right group, 2892, while "su" is the one in the background! Can you try to run strace on the su to see where things go wrong, i.e. $ strace -f -e trace=process su foo ...and we're only interested in what happens up to the point where it hangs. That should hopefully tell us which process is doing the wrong thing. In either case, as Alan pointed out, this seems unlikely to be a kernel problem. [snip] >> so can you please look for anything like that in your login scripts or >> shell rc files? > > I do use stty in my .bashrc (that's why this happens), but I do not put > it in the background. Yeah, most likely the process that calls stty is first put in the background itself (or never brought to the foreground?). But I don't know why... 
when you get the trace, we can compare and find out where it deviates. Vegard ^ permalink raw reply [flat|nested] 54+ messages in thread
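The comparison Vegard makes by eye (pgrp column against tpgid column) can be done mechanically. Below is a small Python sketch that takes the "ps -eo pid,pgrp,tpgid,user,args" rows quoted in this message and lists the processes that are not in the terminal's foreground process group. The function name background_processes is hypothetical; the data is copied from the thread.

```python
# Rows from "ps -eo pid,pgrp,tpgid,user,args | grep 7036" as quoted above.
PS_OUTPUT = """\
6902 6902 7036 root /bin/login --
6922 6922 7036 root -bash
7036 7036 7036 foo su foo
7037 7037 7036 foo bash
7042 7037 7036 foo stty -ixany
"""


def background_processes(ps_text):
    """Return (pid, pgrp, tpgid, args) for rows whose pgrp differs from tpgid."""
    rows = []
    for line in ps_text.splitlines():
        fields = line.split(None, 4)
        # Skip header lines, short lines, and rows without a terminal (tpgid -1).
        if len(fields) < 5 or not all(f.isdigit() for f in fields[:3]):
            continue
        pid, pgrp, tpgid = (int(f) for f in fields[:3])
        if pgrp != tpgid:
            rows.append((pid, pgrp, tpgid, fields[4]))
    return rows


for pid, pgrp, tpgid, args in background_processes(PS_OUTPUT):
    print(f"{pid} ({args}): pgrp {pgrp} is not the foreground group {tpgid}")
```

On this data only "su foo" (7036) is left out of the result, so bash (7037) and stty (7042) are both background processes as far as the terminal is concerned, which matches the analysis in the message.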
* Re: 2.6.25.3: su gets stuck for root 2008-06-14 7:45 ` Vegard Nossum @ 2008-06-14 17:43 ` Joe Peterson 2008-06-14 20:34 ` Vegard Nossum 0 siblings, 1 reply; 54+ messages in thread From: Joe Peterson @ 2008-06-14 17:43 UTC (permalink / raw) To: Vegard Nossum Cc: Alan Cox, Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel [-- Attachment #1: Type: text/plain, Size: 1468 bytes --] Vegard Nossum wrote: > So this clearly shows what's wrong; 7036 is the "controlling process" > group id. But only "su foo" is in this group, the bash and stty > processes have their own group, 7037. > > On my own system, when I do "su", I get this: > 2891 2891 2892 root su temp > 2892 2892 2892 temp bash > > ...and here the "bash" process is in the right group, 2892, while "su" > is the one in the background! Hmm. > Can you try to run strace on the su to see where things go wrong, i.e. > > $ strace -f -e trace=process su foo > > ...and we're only interested in what happens up to the point where it > hangs. That should hopefully tell us which process is doing the wrong > thing. In either case, as Alan pointed out, this seems unlikely to be > a kernel problem. OK, I attached this as a text file at the end. But (*bummer*), using strace makes it impossible to reproduce the hang (figures, and I believe someone earlier in the thread also had this problem). As for whether the kernel is at fault, not sure (i.e. does this hang behavior implicate the kernel automatically or can a user-space process cause itself such an issue?). But I *do* see different behavior depending on the kernel version. There were a couple of git kernels in which I could not reproduce it. Still, if it is a race or something, it might be that the conditions were just slightly perturbed. I attached the strace log just in case it is of help. 
-Joe

[-- Attachment #2: su_strace.log --]
[-- Type: text/x-log, Size: 2501 bytes --]

7009 execve("/bin/su", ["su", "foo"], [/* 32 vars */]) = 0
7009 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7e3d708) = 7010
7010 execve("/bin/bash", ["bash"], [/* 31 vars */]) = 0
7010 clone( <unfinished ...>
7009 waitpid(-1, <unfinished ...>
7010 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7db0708) = 7011
7011 exit_group(0) = ?
7010 --- SIGCHLD (Child exited) @ 0 (0) ---
7010 waitpid(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 7011
7010 waitpid(-1, 0xbff58cec, WNOHANG) = -1 ECHILD (No child processes)
7010 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7db0708) = 7012
7012 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7db0708) = 7013
7013 execve("/usr/bin/dircolors", ["dircolors", "-b", "/etc/DIR_COLORS"], [/* 31 vars */]) = 0
7013 exit_group(0) = ?
7012 --- SIGCHLD (Child exited) @ 0 (0) ---
7012 waitpid(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 7013
7012 waitpid(-1, 0xbff585ec, WNOHANG) = -1 ECHILD (No child processes)
7012 exit_group(0) = ?
7010 waitpid(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 7012
7010 --- SIGCHLD (Child exited) @ 0 (0) ---
7010 waitpid(-1, 0xbff5873c, WNOHANG) = -1 ECHILD (No child processes)
7010 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7db0708) = 7014
7014 execve("/bin/sleep", ["sleep", "2"], [/* 31 vars */]) = 0
7010 waitpid(-1, <unfinished ...>
7014 exit_group(0) = ?
7010 <... waitpid resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 7014
7010 --- SIGCHLD (Child exited) @ 0 (0) ---
7010 waitpid(-1, 0xbff593dc, WNOHANG) = -1 ECHILD (No child processes)
7010 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7db0708) = 7015
7015 execve("/bin/stty", ["stty", "-ixany"], [/* 31 vars */]) = 0
7015 exit_group(0) = ?
7010 --- SIGCHLD (Child exited) @ 0 (0) ---
7010 waitpid(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 7015
7010 waitpid(-1, 0xbff5936c, WNOHANG) = -1 ECHILD (No child processes)
7010 exit_group(0) = ?
7009 <... waitpid resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WSTOPPED) = 7010
7009 exit_group(0) = ?

^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-14 17:43 ` Joe Peterson @ 2008-06-14 20:34 ` Vegard Nossum 2008-06-14 20:52 ` Joe Peterson 0 siblings, 1 reply; 54+ messages in thread From: Vegard Nossum @ 2008-06-14 20:34 UTC (permalink / raw) To: Joe Peterson Cc: Alan Cox, Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel On Sat, Jun 14, 2008 at 7:43 PM, Joe Peterson <joe@skyrush.com> wrote: >> Can you try to run strace on the su to see where things go wrong, i.e. >> >> $ strace -f -e trace=process su foo >> >> ...and we're only interested in what happens up to the point where it >> hangs. That should hopefully tell us which process is doing the wrong >> thing. In either case, as Alan pointed out, this seems unlikely to be >> a kernel problem. > > OK, I attached this as a text file at the end. But (*bummer*), using > strace makes it impossible to reproduce the hang (figures, and I believe > someone earlier in the thread also had this problem). Yeah, but doesn't it loop indefinitely calling ioctl() and getting a SIGTTOU? Tracing up till this point is okay (and what I had in mind). > > As for whether the kernel is at fault, not sure (i.e. does this hang > behavior implicate the kernel automatically or can a user-space process > cause itself such an issue?). But I *do* see different behavior > depending on the kernel version. There were a couple of git kernels in > which I could not reproduce it. Still, if it is a race or something, it > might be that the conditions were just slightly perturbed. Yeah, a user-space process can do this, and it's the right behaviour for the kernel. I did post a program that would "reproduce" what you're seeing. I do now believe that it's something timing-related, as Alan suggested initially. (But timing-related with your scripts, that is. 
I must say, that "sleep 2" does look a bit suspicious; I have no idea what that is supposed to do :-)) I suppose it would be more useful to see a trace where you include a few more system calls, can you try: # strace -e trace=process,ioctl,setpgid -f su foo instead? Just for the record, I'm probably not the best person to debug this, so I'm just trying to figure it out as we go. On the other hand, I don't see better suggestions from anybody else. Thank you for persisting, though! :-) (And the fact that the results differ with the kernel versions does make this relevant for LKML still.) Vegard ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-14 20:34 ` Vegard Nossum @ 2008-06-14 20:52 ` Joe Peterson 2008-06-14 21:26 ` Vegard Nossum 0 siblings, 1 reply; 54+ messages in thread From: Joe Peterson @ 2008-06-14 20:52 UTC (permalink / raw) To: Vegard Nossum Cc: Alan Cox, Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2153 bytes --]

Vegard Nossum wrote:
> Yeah, a user-space process can do this, and it's the right behaviour
> for the kernel. I did post a program that would "reproduce" what
> you're seeing. I do now believe that it's something timing-related, as
> Alan suggested initially. (But timing-related with your scripts, that
> is. I must say, that "sleep 2" does look a bit suspicious; I have no
> idea what that is supposed to do :-))

Ah, that is something I put in there to artificially make it more reproducible. Here's the reason: when I first encountered the problem, it was happening if the home dir of the user was on the "btrfs" filesystem (the new checksumming one from Oracle). This made me suspect btrfs initially. But I reproduced the problem [more sporadically] when the home was on ext3 as well. Since btrfs has a different performance profile, especially when first accessed after a mount (and it is a filesystem still under development, so some optimizations are yet to come), I figured it might be timing-related, and sure enough, adding the "sleep 2" proved that.

So without the sleep 2 and with a home of ext3, it rarely happens, since it takes very little time to read the homedir files (.bashrc, etc.). Putting in the sleep makes it almost always happen. It seems like the delay invoked by the sleep causes that subsequent stty call to hang.

> I suppose it would be more useful to see a trace where you include a
> few more system calls, can you try:
>
> # strace -e trace=process,ioctl,setpgid -f su foo
>
> instead?

OK, attached.

> Just for the record, I'm probably not the best person to debug this,
> so I'm just trying to figure it out as we go. On the other hand, I
> don't see better suggestions from anybody else. Thank you for
> persisting, though! :-)
>
> (And the fact that the results differ with the kernel versions does
> make this relevant for LKML still.)

Thanks for helping. Yes, this is the kind of nagging issue that really bugs me, since it is intermittent and makes things feel unstable. If we determine the problem is in something else (like stty or bash), then at least I can file a bug with them.

-Joe

[-- Attachment #2: strace_su.log --]
[-- Type: text/x-log, Size: 5681 bytes --]

9738 execve("/bin/su", ["su", "foo"], [/* 50 vars */]) = 0
9738 ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9738 ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9738 ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9738 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7dfc708) = 9740
9738 waitpid(-1, <unfinished ...>
9740 execve("/bin/bash", ["bash"], [/* 50 vars */]) = 0
9740 ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9740 ioctl(2, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9740 ioctl(255, TIOCGPGRP, [9737]) = 0
9740 setpgid(0, 9740) = 0
9740 ioctl(255, TIOCSPGRP, [9740]) = 0
9740 ioctl(255, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9740 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7df1708) = 9741
9741 exit_group(0) = ?
9740 --- SIGCHLD (Child exited) @ 0 (0) ---
9740 waitpid(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 9741
9740 waitpid(-1, 0xbfb187fc, WNOHANG) = -1 ECHILD (No child processes)
9740 ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfb18774) = -1 ENOTTY (Inappropriate ioctl for device)
9740 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7df1708) = 9742
9742 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7df1708) = 9743
9742 waitpid(-1, <unfinished ...>
9743 execve("/usr/bin/dircolors", ["dircolors", "-b", "/etc/DIR_COLORS"], [/* 49 vars */]) = 0
9743 exit_group(0) = ?
9742 <... waitpid resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 9743
9742 --- SIGCHLD (Child exited) @ 0 (0) ---
9742 waitpid(-1, 0xbfb17fbc, WNOHANG) = -1 ECHILD (No child processes)
9742 exit_group(0) = ?
9740 --- SIGCHLD (Child exited) @ 0 (0) ---
9740 waitpid(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG) = 9742
9740 waitpid(-1, 0xbfb1824c, WNOHANG) = -1 ECHILD (No child processes)
9740 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7df1708) = 9744
9740 waitpid(-1, <unfinished ...>
9744 execve("/bin/sleep", ["sleep", "2"], [/* 49 vars */]) = 0
9744 exit_group(0) = ?
9740 <... waitpid resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 9744
9740 ioctl(255, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9740 ioctl(255, TIOCGWINSZ, {ws_row=30, ws_col=80, ws_xpixel=724, ws_ypixel=454}) = 0
9740 ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9740 ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9740 ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9740 ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9740 ioctl(1, TIOCGWINSZ, {ws_row=30, ws_col=80, ws_xpixel=724, ws_ypixel=454}) = 0
9740 ioctl(0, TIOCGWINSZ, {ws_row=30, ws_col=80, ws_xpixel=724, ws_ypixel=454}) = 0
9740 --- SIGCHLD (Child exited) @ 0 (0) ---
9740 waitpid(-1, 0xbfb18d3c, WNOHANG) = -1 ECHILD (No child processes)
9740 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7df1708) = 9745
9740 waitpid(-1, <unfinished ...>
9745 execve("/bin/stty", ["stty", "-ixany"], [/* 49 vars */]) = 0
9745 ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9745 ioctl(0, SNDCTL_TMR_STOP or TCSETSW, {B38400 opost isig icanon echo ...}) = 0
9745 ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9745 exit_group(0) = ?
9740 <... waitpid resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 9745
9740 ioctl(255, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9740 ioctl(255, TIOCGWINSZ, {ws_row=30, ws_col=80, ws_xpixel=724, ws_ypixel=454}) = 0
9740 --- SIGCHLD (Child exited) @ 0 (0) ---
9740 waitpid(-1, 0xbfb18d3c, WNOHANG) = -1 ECHILD (No child processes)
9740 ioctl(255, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9740 ioctl(255, TIOCGWINSZ, {ws_row=30, ws_col=80, ws_xpixel=724, ws_ypixel=454}) = 0
9740 ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9740 ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9740 ioctl(1, TIOCGWINSZ, {ws_row=30, ws_col=80, ws_xpixel=724, ws_ypixel=454}) = 0
9740 ioctl(0, TIOCGWINSZ, {ws_row=30, ws_col=80, ws_xpixel=724, ws_ypixel=454}) = 0
9740 ioctl(0, TIOCSWINSZ, {ws_row=30, ws_col=80, ws_xpixel=724, ws_ypixel=454}) = 0
9740 ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9740 ioctl(255, TIOCSPGRP, [9740]) = 0
9740 ioctl(0, TIOCGWINSZ, {ws_row=30, ws_col=80, ws_xpixel=724, ws_ypixel=454}) = 0
9740 ioctl(0, TIOCSWINSZ, {ws_row=30, ws_col=80, ws_xpixel=724, ws_ypixel=454}) = 0
9740 ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
9740 ioctl(0, SNDCTL_TMR_STOP or TCSETSW, {B38400 opost isig -icanon -echo ...}) = 0
9740 ioctl(0, SNDCTL_TMR_STOP or TCSETSW, {B38400 opost isig icanon echo ...}) = 0
9740 ioctl(255, TIOCSPGRP, [9737]) = 0
9740 setpgid(0, 9737) = 0
9740 exit_group(0) = ?
9738 <... waitpid resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WSTOPPED) = 9740
9738 exit_group(0) = ?

^ permalink raw reply [flat|nested] 54+ messages in thread
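In a log this size the interesting calls are easy to miss, so here is a small Python sketch that pulls out only the TIOCSPGRP ioctls, i.e. the points where a traced process sets the terminal's foreground process group. The helper name foreground_changes is hypothetical, and TRACE holds just a handful of lines copied from the attachment above, not the whole log.

```python
import re

# A few lines copied from strace_su.log; the full log can be fed in the same way.
TRACE = """\
9740 ioctl(255, TIOCGPGRP, [9737]) = 0
9740 setpgid(0, 9740) = 0
9740 ioctl(255, TIOCSPGRP, [9740]) = 0
9745 execve("/bin/stty", ["stty", "-ixany"], [/* 49 vars */]) = 0
9740 ioctl(255, TIOCSPGRP, [9740]) = 0
9740 ioctl(255, TIOCSPGRP, [9737]) = 0
9740 setpgid(0, 9737) = 0
"""

TIOCSPGRP_CALL = re.compile(r"(\d+)\s+ioctl\(\d+, TIOCSPGRP, \[(\d+)\]\)")


def foreground_changes(trace):
    """Return (calling pid, new foreground pgrp) for each TIOCSPGRP in order."""
    events = []
    for line in trace.splitlines():
        match = TIOCSPGRP_CALL.match(line)
        if match:
            events.append((int(match.group(1)), int(match.group(2))))
    return events


for pid, pgrp in foreground_changes(TRACE):
    print(f"pid {pid} set the foreground pgrp to {pgrp}")
```

In this (non-hanging) run every change comes from bash itself (9740): it takes the terminal at startup and hands it back to 9737 on exit. Since strace -f only follows su and its descendants, a TIOCSPGRP issued by a process outside that tree would not appear in a log like this one.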
* Re: 2.6.25.3: su gets stuck for root 2008-06-14 20:52 ` Joe Peterson @ 2008-06-14 21:26 ` Vegard Nossum 2008-06-14 21:34 ` Joe Peterson 2008-07-02 18:03 ` tty session leader issue (was Re: 2.6.25.3: su gets stuck for root) Joe Peterson 0 siblings, 2 replies; 54+ messages in thread From: Vegard Nossum @ 2008-06-14 21:26 UTC (permalink / raw) To: Joe Peterson Cc: Alan Cox, Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel On Sat, Jun 14, 2008 at 10:52 PM, Joe Peterson <joe@skyrush.com> wrote: > Vegard Nossum wrote: >> Yeah, a user-space process can do this, and it's the right behaviour >> for the kernel. I did post a program that would "reproduce" what >> you're seeing. I do now believe that it's something timing-related, as >> Alan suggested initially. (But timing-related with your scripts, that >> is. I must say, that "sleep 2" does look a bit suspicious; I have no >> idea what that is supposed to do :-)) > > Ah, that is something I put in there to artificially make it more > reproducible. Here's the reason: when I first encountered the problem, > it was happening if the home dir of the user was on the "btrfs" > filesystem (the new checksumming one from Oracle). This made me suspect > btrfs initially. But I reproduced the problem [more sporadically] when > the home was on ext3 as well. Since btrfs has a different performance > profile, especially when first accessed after a mount (and it is a > filesystem still under development, so some optimizations are yet to > come), I figured it might be timing-related, and sure enough, adding the > "sleep 2" proved that. I'm not sure it is. Try adding sleep 3 instead. 
Because I have the "sleep 2" when I run "su foo" as well, and I _didn't_ put it there: [pid 6298] execve("/bin/sleep", ["sleep", "2"], [/* 47 vars */] <unfinished ...> Vegard ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-06-14 21:26 ` Vegard Nossum @ 2008-06-14 21:34 ` Joe Peterson 2008-07-02 18:03 ` tty session leader issue (was Re: 2.6.25.3: su gets stuck for root) Joe Peterson 1 sibling, 0 replies; 54+ messages in thread From: Joe Peterson @ 2008-06-14 21:34 UTC (permalink / raw) To: Vegard Nossum Cc: Alan Cox, Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel Vegard Nossum wrote: > I'm not sure it is. Try adding sleep 3 instead. Because I have the > "sleep 2" when I run "su foo" as well, and I _didn't_ put it there: > > [pid 6298] execve("/bin/sleep", ["sleep", "2"], [/* 47 vars */] > <unfinished ...> Weird! OK, I tried it with "sleep 3" in .bashrc, and it says "...execve("/usr/bin/sleep", ["sleep", "3"], [/* 30 vars */]) = 0". This sounds like what I'd expect. I don't understand why you see a sleep 2 when you did not have one in your config..... -Joe ^ permalink raw reply [flat|nested] 54+ messages in thread
* tty session leader issue (was Re: 2.6.25.3: su gets stuck for root) 2008-06-14 21:26 ` Vegard Nossum 2008-06-14 21:34 ` Joe Peterson @ 2008-07-02 18:03 ` Joe Peterson 2008-07-02 19:21 ` markus reichelt 2008-07-06 14:08 ` Tim Connors 1 sibling, 2 replies; 54+ messages in thread From: Joe Peterson @ 2008-07-02 18:03 UTC (permalink / raw) To: Vegard Nossum Cc: Alan Cox, Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel, tconnors I have done some more investigation on this problem, and I am posting here my results in hope that someone can point me in the right direction for further investigation... Summary: during the initialization of a new bash shell, the terminal foreground process group often reverts back to that of the parent of the bash shell (after being set *to* the bash shell pgrp by bash), prohibiting commands like stty from being run by the init scripts. The result is that the execution of these commands will hang until killed, causing the bash prompt to not appear. Adding a delay in the script (using sleep) increases the chance of this having time to happen. For example, putting the following in a user's .bashrc: sleep 2 stty -ixany is a good way to reproduce this. Doing "su <user>" from root (note that the fact that no password is required helps the timing) will then often hang. Killing -9 stty will allow the bash prompt to appear. I have instrumented the bash source code in an attempt to see why this is happening, partly because I suspected a bug in bash. What I have found is this: 1) bash calls tcsetpgrp() with the pgrp of the bash process (two times) before starting to execute init scripts. This makes sense, since bash needs to be the session leader. It is never called again until just before the bash shell exits normally (at which time it returns control to the parent).
2) During the processing of the init scripts (sometimes .bashrc, but sometimes a system script that is processed first), calling tcgetpgrp() shows that the pgrp has reverted back to the "su <user>" process. It does not appear that bash reverted it in my testing so far. Running stty while in the reverted state causes a hang, since bash is not the session leader. So here is the question: is there a way/reason the kernel would revert the pgrp of the session leader after bash sets it? Is there some more instrumenting in the kernel or in bash that might reveal what is going on? I have heard yet another report of this happening since I added to the thread, and I can get it to happen easily on two different machines (a desktop and a laptop). Thanks, Joe ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: tty session leader issue (was Re: 2.6.25.3: su gets stuck for root) 2008-07-02 18:03 ` tty session leader issue (was Re: 2.6.25.3: su gets stuck for root) Joe Peterson @ 2008-07-02 19:21 ` markus reichelt 2008-07-06 14:08 ` Tim Connors 1 sibling, 0 replies; 54+ messages in thread From: markus reichelt @ 2008-07-02 19:21 UTC (permalink / raw) To: linux-kernel [-- Attachment #1: Type: text/plain, Size: 316 bytes --] * Joe Peterson <joe@skyrush.com> wrote: > I have done some more investigation on this problem, and I am > posting here my results in hope that someone can point me in the > right direction for further investigation... I cannot reproduce this with 2.6.25.9 (on Slackware 12.0) -- left blank, right bald [-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --] ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: tty session leader issue (was Re: 2.6.25.3: su gets stuck for root) 2008-07-02 18:03 ` tty session leader issue (was Re: 2.6.25.3: su gets stuck for root) Joe Peterson 2008-07-02 19:21 ` markus reichelt @ 2008-07-06 14:08 ` Tim Connors 2008-07-06 16:44 ` Alan Cox 2008-07-06 18:49 ` tty session leader issue [cause now known!] " Joe Peterson 1 sibling, 2 replies; 54+ messages in thread From: Tim Connors @ 2008-07-06 14:08 UTC (permalink / raw) To: Joe Peterson Cc: Vegard Nossum, Alan Cox, Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel On Wed, 2 Jul 2008, Joe Peterson wrote: > I have done some more investigation on this problem, and I am posting > here my results in hope that someone can point me in the right direction > for further investigation... > > Summary: during the initialization of a new bash shell, the terminal > foreground process group often reverts back to that of the parent of the > bash shell (after being set *to* the bash shell pgrp by bash), > prohibiting commands like stty from being run by the init scripts. The > result is that the execution of these commands will hang until killed, > causing the bash prompt to not appear. Adding a delay in the script > (using sleep) increases the chance of this having time to happen. ... > So here is the question: is there a way/reason the kernel would revert > the pgrp of the session leader after bash sets it? Is there some more > instrumenting in the kernel or in bash that might reveal what is going > on? I have heard yet another report of this happening since I added to > the thread, and I can get it to happen easily on two different machines > (a desktop and a laptop). In fact, in various laptops (Eeeepc, dell inspiron 1520, Dell inspiron 4000), I've got various tty screwups that have been introduced since circa 2.6.19. The 6 year old inspiron 4000 gets stuck at stty erase ^? . Randomly, but most of the time. 
All of my machines exhibit the ctrl-C being slower than ctrl-Z discussed elsewhere (I've almost developed a habit of typing ctrl-Z kill %1 <RET>). Although even ctrl-Z recently has been reluctant to always work. I wonder if this is the cause of dpkg recently not responding to ctrl-Z's? (debian bug #486222). dpkg does respond to kill -STOP. ctrl-s doesn't always work anymore. Again, what prompted me to write this email, was I couldn't pause dpkg. It's particularly unreliable at stopping scrolling messages at bootup, and if I press it at the wrong time at bootup (not a specific place - it can be starting up any number of scripts), something deadlocks and won't resume upon a ctrl-q. alt-sysrq-k is enough to kill whatever has deadlocked. I have a feeling, but don't want to test on this system right now, that pressing scroll-lock as opposed to ctrl-q once unlocked such a stuck display. In summary, something in tty is certainly screwed. Does anyone see a connection between all of these? -- TimC > cat ~/.signature Electromagnetic pulse received (core dumped) ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: tty session leader issue (was Re: 2.6.25.3: su gets stuck for root) 2008-07-06 14:08 ` Tim Connors @ 2008-07-06 16:44 ` Alan Cox 2008-07-06 18:49 ` tty session leader issue [cause now known!] " Joe Peterson 1 sibling, 0 replies; 54+ messages in thread From: Alan Cox @ 2008-07-06 16:44 UTC (permalink / raw) To: Tim Connors Cc: Joe Peterson, Vegard Nossum, Alan Cox, Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel On Mon, Jul 07, 2008 at 12:08:58AM +1000, Tim Connors wrote: > In summary, something in tty is certainly screwed. Does anyone see a > connection between all of these? That they don't happen for me - at all is the only one I can suggest ? Most of your comments are also not ones I've seen reported before. Unfortunately 'works for me' doesn't tell me whether that is luck, distribution specific, user configuration choices, gcc version, bugs in code , or whatever and someone who sees the ^C problem is going to have to track it down. Alan ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: tty session leader issue [cause now known!] (was Re: 2.6.25.3: su gets stuck for root)
  2008-07-06 14:08 ` Tim Connors
  2008-07-06 16:44   ` Alan Cox
@ 2008-07-06 18:49   ` Joe Peterson
  1 sibling, 0 replies; 54+ messages in thread
From: Joe Peterson @ 2008-07-06 18:49 UTC
To: Tim Connors
Cc: Vegard Nossum, Alan Cox, Alan Cox, David Newall, Willy Tarreau, Harald Dunkel, linux-kernel, Ingo Molnar

Tim Connors wrote:
> On Wed, 2 Jul 2008, Joe Peterson wrote:
>
>> I have done some more investigation on this problem, and I am posting
>> here my results in hope that someone can point me in the right direction
>> for further investigation...
>>
>> Summary: during the initialization of a new bash shell, the terminal
>> foreground process group often reverts back to that of the parent of the
>> bash shell (after being set *to* the bash shell pgrp by bash),
>> prohibiting commands like stty from being run by the init scripts. The
>> result is that the execution of these commands will hang until killed,
>> causing the bash prompt to not appear. Adding a delay in the script
>> (using sleep) increases the chance of this having time to happen.

I have done more investigation, and I now know the cause of the bash/stty
problem. It appears to be a race condition in bash (well, between two
different bash shells, actually). I saw a post from a while back about
something similar by Ingo Molnar, so I have copied him here too.

Here is the ps tree of the test case where stty has hung:

 4704 ?        S      0:00  \_ xterm
 4706 pts/3    Ss     0:00  |   \_ -bash
 4739 pts/3    S      0:00  |       \_ su
 4742 pts/3    S      0:00  |           \_ bash
 4746 pts/3    S+     0:00  |               \_ su foo
 4747 pts/3    S      0:00  |                   \_ bash
 4752 pts/3    T      0:00  |                       \_ stty -ixany

What should happen is: when "su foo" (4746) is run, it spawns a bash shell
(4747) that then makes itself the session leader when it initializes its
job control. The stty command (in the child bash's .bashrc) will then be
able to work (and not hang).

However, the hang happens when the parent bash (4742) interferes by
reverting the tty session leader back to its child (the "su foo" process:
4746) shortly after the child bash (4747) becomes the leader. The parent
does this when it calls
execute_command_internal()->stop_pipeline()->give_terminal_to(). This
seems to happen at a slightly random time, making the issue intermittent -
it depends which one wins the race.

In summary, when the bug does *not* occur, here is the approximate
sequence (note I am :

1) parent bash (4742) runs 'su foo' (4746)
2) parent bash sets tty leader to 'su' (4746)
3) child bash (4747) initializes and sets itself to be the leader
4) stty command in .bashrc runs successfully

When the bug occurs, here is the sequence:

1) parent bash (4742) runs 'su foo' (4746)
2) child bash (4747) initializes and sets itself to be the leader
3) parent bash sets tty leader *back* to 'su' (4746)
4) stty command runs and fails/hangs because its parent is not leader

The various calls to tcsetpgrp() that do this are interleaved from the two
bash processes, and sometimes the parent does it slightly *after* the
child bash initializes job control - that's when the problem happens.

I have not looked further to find a solution (but it's a great start to
know the cause...!). Any further help is welcome.

> The 6 year old inspiron 4000 gets stuck at stty erase ^? . Randomly, but
> most of the time.
>
> All of my machines exhibit the ctrl-C being slower than ctrl-Z discussed
> elsewhere (I've almost developed a habit of typing ctrl-Z kill %1 <RET>).
> Although even ctrl-Z recently has been reluctant to always work. I wonder
> if this is the cause of dpkg recently not responding to ctrl-Z's? (debian
> bug #486222). dpkg does respond to kill -STOP

I doubt that this is related. See the following thread for more info on
this:

http://marc.info/?l=linux-kernel&m=121528829718840&w=2

> ctrl-s doesn't always work anymore. Again, what prompted me to write this
> email, was I couldn't pause dpkg. It's particularly unreliable at
> stopping scrolling messages at bootup, and if I press it at the wrong time
> at bootup (not a specific place - it can be starting up any number of
> scripts), something deadlocks and won't resume upon a ctrl-q.
> alt-sysrq-k is enough to kill whatever has deadlocked. I have a feeling,
> but don't want to test on this system right now, that pressing scroll-lock
> as opposed to ctrl-q once unlocked such a stuck display.

Hmm, not sure; I have not seen that behavior.

> In summary, something in tty is certainly screwed. Does anyone see a
> connection between all of these?

I doubt there is a connection between the bash issue and what you are
seeing with ctrl-C/ctrl-S, etc.

-Joe
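The two orderings Joe lists can be replayed with a toy model. This is only a sketch under stated assumptions: the terminal is reduced to a single "foreground process group" value (what the message calls the tty "leader"), each process is assumed to lead a process group numbered after its PID, and stty is assumed to run in the child bash's group. The names `Tty` and `replay` are made up for illustration; nothing here touches a real terminal or bash.

```python
from dataclasses import dataclass

# PIDs taken from the ps tree in the message; treating each PID as the ID of
# its own process group is an assumption of this sketch.
PARENT_BASH, SU_FOO, CHILD_BASH = 4742, 4746, 4747


@dataclass
class Tty:
    """Toy terminal that remembers only its foreground process group."""
    foreground_pgrp: int


def replay(events, stty_pgrp, initial_pgrp=PARENT_BASH):
    """Apply tcsetpgrp()-like events in order and report what stty would see.

    events is a list of (caller, new_foreground_pgrp) pairs.
    """
    tty = Tty(initial_pgrp)
    for _caller, new_pgrp in events:
        tty.foreground_pgrp = new_pgrp
    if tty.foreground_pgrp == stty_pgrp:
        return "stty runs"
    return "stty stopped (SIGTTOU)"


# Ordering when the bug does not occur: the parent hands over first.
good_order = [("parent bash", SU_FOO), ("child bash", CHILD_BASH)]
# Ordering when the bug occurs: the parent's call lands after the child's.
bad_order = [("child bash", CHILD_BASH), ("parent bash", SU_FOO)]

print(replay(good_order, stty_pgrp=CHILD_BASH))
print(replay(bad_order, stty_pgrp=CHILD_BASH))
```

Only the last writer matters in this model, which is why the outcome depends purely on which bash calls tcsetpgrp() last.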
* Re: 2.6.25.3: su gets stuck for root 2008-06-02 5:12 ` Harald Dunkel 2008-06-02 5:32 ` Willy Tarreau @ 2008-06-02 5:42 ` Joe Peterson 1 sibling, 0 replies; 54+ messages in thread From: Joe Peterson @ 2008-06-02 5:42 UTC (permalink / raw) To: Harald Dunkel; +Cc: linux-kernel, Alan Cox Harald Dunkel wrote: > Joe Peterson wrote: >> Hi Harold, >> >> Doing "ps" while hung shows stty in the "T" state. "killall -9 stty" >> releases it. >> > > Does strace give you the same output if you attach it to the blocking > stty (strace -p $pid)? > > I got > > > : > ioctl(0, SNDCTL_TMR_START or TCSETS, {B38400 opost isig icanon echo ...}) = ? ERESTARTSYS (To be restarted) > --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- > --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- Yep, almost the same. I get (repeating): ioctl(0, SNDCTL_TMR_STOP or TCSETSW, {B38400 opost isig icanon echo ...}) = ? ERESTARTSYS (To be restarted) --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- -Joe ^ permalink raw reply [flat|nested] 54+ messages in thread
* 2.6.25.3: su gets stuck for root @ 2008-05-13 6:17 Harald Dunkel 2008-05-13 6:47 ` Vegard Nossum 0 siblings, 1 reply; 54+ messages in thread From: Harald Dunkel @ 2008-05-13 6:17 UTC (permalink / raw) To: linux-kernel Hi folks, I haven't seen it mentioned here (hopefully I wasn't too blind to see): If I run "su someuser" as root, then it gets stuck. No prompt. I cannot interrupt it with ^C or ^Z either. /var/log/auth.log says: May 13 08:06:41 pluto su[4193]: Successful su for root by harri May 13 08:06:41 pluto su[4193]: + pts/3 harri:root May 13 08:06:41 pluto su[4193]: pam_unix(su:session): session opened for user root by harri(uid=1000) ps shows: % ps -ef | grep pts/3 harri 4007 4006 0 07:58 pts/3 00:00:00 bash root 4193 4007 0 08:06 pts/3 00:00:00 su root 4194 4193 0 08:06 pts/3 00:00:00 bash root 4209 4194 0 08:08 pts/3 00:00:00 su root 4210 4209 0 08:08 pts/3 00:00:00 bash root 4217 4210 0 08:08 pts/3 00:00:00 stty intr ^C So obviously 'stty' is to blame here (called from root's .bashrc, as it seems). But for 2.6.24.4 there is no such problem. Maybe you could spread some light here? Regards Harri ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root
  2008-05-13  6:17 Harald Dunkel
@ 2008-05-13  6:47 ` Vegard Nossum
  2008-05-13 17:43   ` Harald Dunkel
  0 siblings, 1 reply; 54+ messages in thread
From: Vegard Nossum @ 2008-05-13 6:47 UTC
To: Harald Dunkel; +Cc: linux-kernel

Hi,

On Tue, May 13, 2008 at 8:17 AM, Harald Dunkel <harald.dunkel@t-online.de> wrote:
> Hi folks,
>
> I haven't seen it mentioned here (hopefully I wasn't too blind to see):
>
> If I run "su someuser" as root, then it gets stuck. No prompt. I cannot
> interrupt it with ^C or ^Z either. /var/log/auth.log says:

...

> So obviously 'stty' is to blame here (called from root's .bashrc, as
> it seems). But for 2.6.24.4 there is no such problem.
>
>
> Maybe you could spread some light here?

Can you try to run "strace su someuser" instead? This should produce
some screenfuls of information, but will also pinpoint the location of
the hang in userspace if it's an error somewhere in the path of a system
call.

Strace will output to stderr, so use 2>strace.txt to capture it to a
file. If you do this as root, you should not have to enter any
passwords, so they won't be logged either.

Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036
* Re: 2.6.25.3: su gets stuck for root
  2008-05-13  6:47 ` Vegard Nossum
@ 2008-05-13 17:43   ` Harald Dunkel
  2008-05-13 19:46     ` Willy Tarreau
  2008-05-14  7:34     ` Vegard Nossum
  0 siblings, 2 replies; 54+ messages in thread
From: Harald Dunkel @ 2008-05-13 17:43 UTC
To: Vegard Nossum; +Cc: linux-kernel

Vegard Nossum wrote:
>
> Can you try to run "strace su someuser" instead?
>

I tried, but it's a Heisenbug: If I run it with strace (with or without
-f), then it doesn't get stuck.


Regards

Harri
* Re: 2.6.25.3: su gets stuck for root
  2008-05-13 17:43   ` Harald Dunkel
@ 2008-05-13 19:46     ` Willy Tarreau
  2008-05-14  4:55       ` Harald Dunkel
  2008-05-14  7:34     ` Vegard Nossum
  1 sibling, 1 reply; 54+ messages in thread
From: Willy Tarreau @ 2008-05-13 19:46 UTC
To: Harald Dunkel; +Cc: Vegard Nossum, linux-kernel

On Tue, May 13, 2008 at 07:43:33PM +0200, Harald Dunkel wrote:
> Vegard Nossum wrote:
> >
> >Can you try to run "strace su someuser" instead?
> >
>
> I tried, but it's a Heisenbug: If I run it with strace (with or without
> -f), then it doesn't get stuck.

Even as root? Because if you run strace on "su" as a user, you will
lose the setuid which may change the conditions.

Willy
* Re: 2.6.25.3: su gets stuck for root 2008-05-13 19:46 ` Willy Tarreau @ 2008-05-14 4:55 ` Harald Dunkel 2008-05-14 5:46 ` Willy Tarreau 0 siblings, 1 reply; 54+ messages in thread From: Harald Dunkel @ 2008-05-14 4:55 UTC (permalink / raw) To: Willy Tarreau; +Cc: Vegard Nossum, linux-kernel Willy Tarreau wrote: > > even as root ? Because if you run strace on "su" as a user, you will > lose the setuid which may change the conditions. > It gets stuck _only_ if I am root. Running su without root permission is no problem. As written before, If I kick out the "stty intr '^C'" from root's .bashrc, then the second su succeeds. Running the stty command manually in this new shell is no problem. Can you reproduce this problem? It doesn't need XWindow or any special user. Simply login as root on tty2, and run "su". If it gets stuck, then you could login on tty3 and kill -9 ### the blocking stty. Regards Harri ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-05-14 4:55 ` Harald Dunkel @ 2008-05-14 5:46 ` Willy Tarreau 0 siblings, 0 replies; 54+ messages in thread From: Willy Tarreau @ 2008-05-14 5:46 UTC (permalink / raw) To: Harald Dunkel; +Cc: Vegard Nossum, linux-kernel On Wed, May 14, 2008 at 06:55:24AM +0200, Harald Dunkel wrote: > Willy Tarreau wrote: > > > >even as root ? Because if you run strace on "su" as a user, you will > >lose the setuid which may change the conditions. > > > > It gets stuck _only_ if I am root. Running su without root permission > is no problem. OK. That's quite strange. > As written before, If I kick out the "stty intr '^C'" from root's > .bashrc, then the second su succeeds. Running the stty command > manually in this new shell is no problem. > > Can you reproduce this problem? It doesn't need XWindow or any > special user. Simply login as root on tty2, and run "su". If it > gets stuck, then you could login on tty3 and kill -9 ### the blocking > stty. I tried (I'm on 2.6.25.1). But neither "su", "su willy" nor "stty intr ^C" caused such a problem unfortunately. You may want to ask several people to test in the same environment as yours (same distro, etc...) in order to find whether it's a user-land bug or a kernel bug. Most likely it's a kernel bug or side effect which is triggered by your version of su or stty. > Regards > > Harri Willy ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root
  2008-05-13 17:43   ` Harald Dunkel
  2008-05-13 19:46     ` Willy Tarreau
@ 2008-05-14  7:34     ` Vegard Nossum
  2008-05-14 17:05       ` Harald Dunkel
  1 sibling, 1 reply; 54+ messages in thread
From: Vegard Nossum @ 2008-05-14 7:34 UTC
To: Harald Dunkel; +Cc: linux-kernel

On Tue, May 13, 2008 at 7:43 PM, Harald Dunkel <harald.dunkel@t-online.de> wrote:
> Vegard Nossum wrote:
> >
> > Can you try to run "strace su someuser" instead?
>
> I tried, but it's a Heisenbug: If I run it with strace (with or without
> -f), then it doesn't get stuck.

Ah, tough luck. What if you attach an strace to the already stuck
process? Like strace -p pid. It's a long shot, but worth a try, I
suppose.

You could also try to use sysrq to output some useful information about
the stuck process. See Documentation/sysrq.txt for more information.

Gotta run, good luck!

Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036
* Re: 2.6.25.3: su gets stuck for root 2008-05-14 7:34 ` Vegard Nossum @ 2008-05-14 17:05 ` Harald Dunkel 2008-05-14 17:17 ` Vegard Nossum 0 siblings, 1 reply; 54+ messages in thread From: Harald Dunkel @ 2008-05-14 17:05 UTC (permalink / raw) To: Vegard Nossum; +Cc: linux-kernel Vegard Nossum wrote: > > Ah, tough luck. What if you attach an strace to the already stuck > process? Like strace -p pid. It's a long shot, but worth a try, I > suppose. > As suggested I attached strace to the blocking stty. I got a continuous flow of : : ioctl(0, SNDCTL_TMR_STOP or TCSETSW, {B38400 opost isig icanon echo ...}) = ? ERESTARTSYS (To be restarted) --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- : : I also moved back to an old version of coreutils from about 4 months ago. Same result. Regards Harri ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-05-14 17:05 ` Harald Dunkel @ 2008-05-14 17:17 ` Vegard Nossum 2008-05-14 17:35 ` Alan Cox 0 siblings, 1 reply; 54+ messages in thread From: Vegard Nossum @ 2008-05-14 17:17 UTC (permalink / raw) To: Harald Dunkel, Alan Cox; +Cc: linux-kernel On Wed, May 14, 2008 at 7:05 PM, Harald Dunkel <harald.dunkel@t-online.de> wrote: > Vegard Nossum wrote: >> >> Ah, tough luck. What if you attach an strace to the already stuck >> process? Like strace -p pid. It's a long shot, but worth a try, I >> suppose. >> > > As suggested I attached strace to the blocking stty. I got a > continuous flow of > > : > : > ioctl(0, SNDCTL_TMR_STOP or TCSETSW, {B38400 opost isig icanon echo ...}) = > ? ERESTARTSYS (To be restarted) > --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- > --- SIGTTOU (Stopped (tty output)) @ 0 (0) --- > : > : > > I also moved back to an old version of coreutils from about > 4 months ago. Same result. Looks good. Adding Alan Cox to conversation. Original report: -------- If I run "su someuser" as root, then it gets stuck. No prompt. I cannot interrupt it with ^C or ^Z either. /var/log/auth.log says: May 13 08:06:41 pluto su[4193]: Successful su for root by harri May 13 08:06:41 pluto su[4193]: + pts/3 harri:root May 13 08:06:41 pluto su[4193]: pam_unix(su:session): session opened for user root by harri(uid=1000) ps shows: % ps -ef | grep pts/3 harri 4007 4006 0 07:58 pts/3 00:00:00 bash root 4193 4007 0 08:06 pts/3 00:00:00 su root 4194 4193 0 08:06 pts/3 00:00:00 bash root 4209 4194 0 08:08 pts/3 00:00:00 su root 4210 4209 0 08:08 pts/3 00:00:00 bash root 4217 4210 0 08:08 pts/3 00:00:00 stty intr ^C So obviously 'stty' is to blame here (called from root's .bashrc, as it seems). But for 2.6.24.4 there is no such problem. 
-------- Vegard -- "The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation." -- E. W. Dijkstra, EWD1036 ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-05-14 17:17 ` Vegard Nossum @ 2008-05-14 17:35 ` Alan Cox 2008-05-18 17:56 ` Harald Dunkel 0 siblings, 1 reply; 54+ messages in thread From: Alan Cox @ 2008-05-14 17:35 UTC (permalink / raw) To: Vegard Nossum; +Cc: Harald Dunkel, linux-kernel > If I run "su someuser" as root, then it gets stuck. No prompt. I cannot > interrupt it with ^C or ^Z either. /var/log/auth.log says: Is it then killable remotely, can you gdb it ? ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root
  2008-05-14 17:35 ` Alan Cox
@ 2008-05-18 17:56   ` Harald Dunkel
  2008-05-18 17:51     ` Alan Cox
  0 siblings, 1 reply; 54+ messages in thread
From: Harald Dunkel @ 2008-05-18 17:56 UTC
To: Alan Cox; +Cc: Vegard Nossum, linux-kernel

Alan Cox wrote:
>> If I run "su someuser" as root, then it gets stuck. No prompt. I cannot
>> interrupt it with ^C or ^Z either. /var/log/auth.log says:
>
> Is it then killable remotely, can you gdb it ?
>

Attaching to the running stty failed:

# gdb stty 26465
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...
Attaching to program: /var/tmp/coreutils-6.10/coreutils-6.10/src/stty, process 26465
/tmp/buildd/gdb-6.8/gdb/linux-nat.c:988: internal-error: linux_nat_attach: Assertion `pid == GET_PID (inferior_ptid) && WIFSTOPPED (status) && WSTOPSIG (status) == SIGSTOP' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
:

I can kill it with -9, but INTR, SEGV, BUS, etc. don't interrupt it.
And if I run stty inside of the debugger (or on the command line),
then it doesn't get stuck.

I have changed stty.c to do some print statements, and changed root's
.bashrc to use it for "stty intr ^C": It gets stuck inside tcsetattr():

	printf("here we are\n");
	if (tcsetattr (STDIN_FILENO, TCSADRAIN, &mode))
		error (EXIT_FAILURE, errno, "%s", device_name);
	printf("born to be kings\n");

The previous tcgetattr() to read the old terminal settings shows no
problem (AFAICS). Replacing TCSADRAIN by 0 doesn't help.


Regards

Harri
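The behaviour Harri describes (tcgetattr() succeeds, tcsetattr() never returns, and the process can only be killed with -9) matches what a process sees when it changes terminal settings while its process group is not the terminal's foreground group: it is stopped by SIGTTOU. A minimal sketch of that mechanism on a scratch pseudo-terminal follows. It assumes Linux, a usable /dev/ptmx, and that opening a tty as a session leader without O_NOCTTY makes it the controlling terminal; the function name is made up, and this reproduces the signal only, not the bash race itself.

```python
import os
import pty
import signal
import termios


def background_tcsetattr_is_stopped():
    """Return True if tcsetattr() from a background process group gets SIGTTOU."""
    master_fd, slave_fd = pty.openpty()
    slave_name = os.ttyname(slave_fd)
    read_end, write_end = os.pipe()

    leader = os.fork()
    if leader == 0:
        try:
            os.close(read_end)
            # New session: the next tty opened becomes our controlling terminal,
            # and our own process group becomes its foreground group.
            os.setsid()
            tty_fd = os.open(slave_name, os.O_RDWR)
            worker = os.fork()
            if worker == 0:
                try:
                    signal.signal(signal.SIGTTOU, signal.SIG_DFL)
                    signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGTTOU})
                    os.setpgid(0, 0)  # own group, so no longer in the foreground
                    attrs = termios.tcgetattr(tty_fd)  # reading is allowed
                    termios.tcsetattr(tty_fd, termios.TCSADRAIN, attrs)  # stops here
                finally:
                    os._exit(0)
            _, status = os.waitpid(worker, os.WUNTRACED)
            stopped = os.WIFSTOPPED(status) and os.WSTOPSIG(status) == signal.SIGTTOU
            os.write(write_end, b"1" if stopped else b"0")
            if stopped:
                os.kill(worker, signal.SIGKILL)  # same remedy as "kill -9" in the thread
                os.waitpid(worker, 0)
        finally:
            os._exit(0)

    os.close(write_end)
    result = os.read(read_end, 1)
    os.waitpid(leader, 0)
    for fd in (read_end, master_fd, slave_fd):
        os.close(fd)
    return result == b"1"


if __name__ == "__main__":
    print("stopped by SIGTTOU:", background_tcsetattr_is_stopped())
```

The worker is observed with waitpid(..., WUNTRACED) rather than by reading its output, since a stopped process cannot report anything itself.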
* Re: 2.6.25.3: su gets stuck for root
  2008-05-18 17:56   ` Harald Dunkel
@ 2008-05-18 17:51     ` Alan Cox
  2008-05-20 19:01       ` Harald Dunkel
  0 siblings, 1 reply; 54+ messages in thread
From: Alan Cox @ 2008-05-18 17:51 UTC
To: Harald Dunkel; +Cc: Vegard Nossum, linux-kernel

> This GDB was configured as "x86_64-linux-gnu"...
> Attaching to program: /var/tmp/coreutils-6.10/coreutils-6.10/src/stty, process 26465
> /tmp/buildd/gdb-6.8/gdb/linux-nat.c:988: internal-error: linux_nat_attach: Assertion `pid == GET_PID (inferior_ptid) && WIFSTOPPED (status) && WSTOPSIG (status) == SIGSTOP' failed.
> A problem internal to GDB has been detected,

So your gdb is broken, your stty is doing stuff nobody else seems to be
seeing? My first thought is to suspect the distro/source as I've still
had no other equivalent reports and given the gdb spew.

I'll have a dig further however as the GDB spew itself might actually be
a useful clue. Not sure I can do much without a proper trace however (if
gdb is borked on your box is strace giving sane reports?)

Alan
* Re: 2.6.25.3: su gets stuck for root
  2008-05-18 17:51     ` Alan Cox
@ 2008-05-20 19:01       ` Harald Dunkel
  2008-05-20 19:12         ` david
  0 siblings, 1 reply; 54+ messages in thread
From: Harald Dunkel @ 2008-05-20 19:01 UTC
To: Alan Cox; +Cc: Vegard Nossum, linux-kernel

Alan Cox wrote:
>> This GDB was configured as "x86_64-linux-gnu"...
>> Attaching to program: /var/tmp/coreutils-6.10/coreutils-6.10/src/stty, process 26465
>> /tmp/buildd/gdb-6.8/gdb/linux-nat.c:988: internal-error: linux_nat_attach: Assertion `pid == GET_PID (inferior_ptid) && WIFSTOPPED (status) && WSTOPSIG (status) == SIGSTOP' failed.
>> A problem internal to GDB has been detected,
>
> So your gdb is broken, your stty is doing stuff nobody else seems to be
> seeing ? My first thought is to suspect the distro/source as I've still
> had no other equivalent reports and given the gdb spew.
>

I doubt that Debian is to blame here. I get the same with native
gdb-6.8 and native coreutils-6.10:

# /usr/local/stow/gdb-6.8/bin/gdb ./stty 11944
GNU gdb 6.8
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
Attaching to program: /var/tmp/coreutils-6.10/coreutils-6.10/src/stty, process 11944
../../gdb/linux-nat.c:988: internal-error: linux_nat_attach: Assertion `pid == GET_PID (inferior_ptid) && WIFSTOPPED (status) && WSTOPSIG (status) == SIGSTOP' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) y
../../gdb/linux-nat.c:988: internal-error: linux_nat_attach: Assertion `pid == GET_PID (inferior_ptid) && WIFSTOPPED (status) && WSTOPSIG (status) == SIGSTOP' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n) n

> I'll have a dig further however as the GDB spew itself might actually be
> a useful clue. Not sure I can do much without a proper trace however (if
> gdb is borked on your box is strace giving sane reports ?)
>

As written earlier in this thread I can attach strace to the running
stty. It prints a continuous flow of

:
ioctl(0, SNDCTL_TMR_START or TCSETS, {B38400 opost isig icanon echo ...}) = ? ERESTARTSYS (To be restarted)
--- SIGTTOU (Stopped (tty output)) @ 0 (0) ---
--- SIGTTOU (Stopped (tty output)) @ 0 (0) ---
:

Using strace -v I get

:
ioctl(0, SNDCTL_TMR_START or TCSETS, {c_iflags=0x500, c_oflags=0x5, c_cflags=0xbf, c_lflags=0x8a3b, c_line=0, c_cc="\x03\x1c\x7f\x15\x04\x00\x01\x00\x11\x13\x1a\x00\x12\x0f\x17\x16\x00\x00\x00"}) = ? ERESTARTSYS (To be restarted)
--- SIGTTOU (Stopped (tty output)) @ 0 (0) ---
--- SIGTTOU (Stopped (tty output)) @ 0 (0) ---
:

Surely I am no specialist for termio. Please mail if I can help.

BTW, I am on 2.6.25.4 now.


Regards

Harri
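The raw values in the strace -v line can be decoded by hand. The sketch below does so with a small subset of the termios bit values hard-coded as they are defined on Linux for x86 (an assumption of this sketch; other platforms use different numbers), and the helper names are made up for illustration. It shows an ordinary cooked-mode configuration, and in particular that TOSTOP is not set, so the SIGTTOU does not come from the "stop background jobs that write output" option. A terminal-settings change from a background process group raises SIGTTOU regardless of TOSTOP.

```python
# Subset of Linux (x86) termios bit values, hard-coded so the sketch does not
# depend on the host's termios module. Only flags relevant here are listed.
IFLAGS = {"ICRNL": 0x100, "IXON": 0x400, "IXANY": 0x800}
LFLAGS = {
    "ISIG": 0x1, "ICANON": 0x2, "ECHO": 0x8, "ECHOE": 0x10, "ECHOK": 0x20,
    "TOSTOP": 0x100, "ECHOCTL": 0x200, "ECHOKE": 0x800, "IEXTEN": 0x8000,
}
# Order of the first control-character slots in c_cc on Linux.
CC_NAMES = ["VINTR", "VQUIT", "VERASE", "VKILL", "VEOF"]


def decode_flags(value, table):
    """Return the sorted names of all bits from table that are set in value."""
    return sorted(name for name, bit in table.items() if value & bit)


def caret(byte):
    """Render a control character in caret notation, e.g. 0x03 -> '^C'."""
    if byte < 0x20 or byte == 0x7F:
        return "^" + chr(byte ^ 0x40)
    return chr(byte)


def decode_cc(raw):
    """Map the leading c_cc bytes to their names in caret notation."""
    return {name: caret(byte) for name, byte in zip(CC_NAMES, raw)}


# Values copied from the strace -v output in the message above.
print(decode_flags(0x500, IFLAGS))
print(decode_flags(0x8A3B, LFLAGS))
print(decode_cc(b"\x03\x1c\x7f\x15\x04"))
```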
* Re: 2.6.25.3: su gets stuck for root
  2008-05-20 19:01       ` Harald Dunkel
@ 2008-05-20 19:12         ` david
  2008-05-20 20:26           ` Harald Dunkel
  0 siblings, 1 reply; 54+ messages in thread
From: david @ 2008-05-20 19:12 UTC
To: Harald Dunkel; +Cc: Alan Cox, Vegard Nossum, linux-kernel

On Tue, 20 May 2008, Harald Dunkel wrote:

> Alan Cox wrote:
>>> This GDB was configured as "x86_64-linux-gnu"...
>>> Attaching to program: /var/tmp/coreutils-6.10/coreutils-6.10/src/stty,
>>> process 26465
>>> /tmp/buildd/gdb-6.8/gdb/linux-nat.c:988: internal-error: linux_nat_attach:
>>> Assertion `pid == GET_PID (inferior_ptid) && WIFSTOPPED (status) &&
>>> WSTOPSIG (status) == SIGSTOP' failed.
>>> A problem internal to GDB has been detected,
>>
>> So your gdb is broken, your stty is doing stuff nobody else seems to be
>> seeing ? My first thought is to suspect the distro/source as I've still
>> had no other equivalent reports and given the gdb spew.
>>
>
> I doubt that Debian is to blame here. I get the same with native
> gdb-6.8 and native coreutils-6.10:

Try killing syslog and then see if you continue to have the problem.
It's possible that what's happening is syslog is getting stuck and su is
sitting waiting for syslog to process the log entry.

David Lang
* Re: 2.6.25.3: su gets stuck for root 2008-05-20 19:12 ` david @ 2008-05-20 20:26 ` Harald Dunkel 2008-05-20 20:38 ` Willy Tarreau 0 siblings, 1 reply; 54+ messages in thread From: Harald Dunkel @ 2008-05-20 20:26 UTC (permalink / raw) To: david; +Cc: Alan Cox, Vegard Nossum, linux-kernel david@lang.hm wrote: > > try killing syslog and then see if you continue to have the problem. > It's possible that what's happening is syslog is getting stuck and su is > sitting waiting for syslog to process the log entry. > No improvement: I killed syslogd, and yet stty gets stuck. But surely it was a smart idea. Regards Harri ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: 2.6.25.3: su gets stuck for root 2008-05-20 20:26 ` Harald Dunkel @ 2008-05-20 20:38 ` Willy Tarreau 0 siblings, 0 replies; 54+ messages in thread From: Willy Tarreau @ 2008-05-20 20:38 UTC (permalink / raw) To: Harald Dunkel; +Cc: david, Alan Cox, Vegard Nossum, linux-kernel On Tue, May 20, 2008 at 10:26:16PM +0200, Harald Dunkel wrote: > david@lang.hm wrote: > > > >try killing syslog and then see if you continue to have the problem. > >It's possible that what's happening is syslog is getting stuck and su is > >sitting waiting for syslog to process the log entry. > > > > No improvement: I killed syslogd, and yet stty gets stuck. > > But surely it was a smart idea. have you checked in /proc/$(pidof stty)/fd/ to see if stty is bound to any particular file descriptor ? Maybe you'll find the source of the blocking operation here. You can find a unix socket attached to another process, it may be a tty, a pipe, or even a device (eg: /dev/random when there is no entropy left). Same for "su" BTW. Regards, willy ^ permalink raw reply [flat|nested] 54+ messages in thread
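Alongside /proc/<pid>/fd, the line in /proc/<pid>/stat records the process group of a process and the foreground process group of its controlling terminal (the pgrp and tpgid fields documented in proc(5)). Comparing the two is one way to check whether a stopped stty is in the background. The parser below is a sketch; the helper names are made up, and the sample line is hypothetical, using PIDs from the ps tree earlier on this page rather than output captured from an affected machine.

```python
def parse_stat(line):
    """Parse the leading fields of a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain spaces
    or parentheses, so split on the last ')' instead of on whitespace.
    """
    head, _, tail = line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    fields = tail.split()
    return {
        "pid": int(pid_text),
        "comm": comm,
        "state": fields[0],
        "ppid": int(fields[1]),
        "pgrp": int(fields[2]),
        "session": int(fields[3]),
        "tty_nr": int(fields[4]),
        "tpgid": int(fields[5]),
    }


def in_foreground(stat):
    """True if the process group is the terminal's foreground process group."""
    return stat["pgrp"] == stat["tpgid"]


# Hypothetical sample: a stopped stty whose group (4747) differs from the
# terminal's foreground group (4746). The session ID and trailing fields are
# invented for illustration.
sample = "4752 (stty) T 4747 4747 4706 34819 4746 4194304 0 0"
info = parse_stat(sample)
print(info["state"], info["pgrp"], info["tpgid"], in_foreground(info))
```

On a live system the same function could be fed the contents of /proc/<pid>/stat for the stuck process.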