Re: Bisected regression in Linux 3.7.0-rc1, hang on login caused by replace_fd()

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Al Viro <viro@ZenIV.linux.org.uk>
To: Pavel Roskin <proski@gnu.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Bisected regression in Linux 3.7.0-rc1, hang on login caused by replace_fd()
Date: Mon, 15 Oct 2012 22:51:27 +0100	[thread overview]
Message-ID: <20121015215127.GK2616@ZenIV.linux.org.uk> (raw)
In-Reply-To: <20121015171437.32e9a4ac@mj>

On Mon, Oct 15, 2012 at 05:14:37PM -0400, Pavel Roskin wrote:
> Hello!
> 
> I tried the current mainline Linux on Fedora 16 x64_64 and found that I
> cannot login in gdm.  I'm using gdm with LXDE.  After a few minutes,
> kernel messages appear indicating a hung process gnome-settings-daemon.

What is it hanging on?

> I'll try to figure out why replace_fd() is not good in
> umh_pipe_setup(), but I'd like to report the problem as soon as
> possible, as it's very visible to the users and I don't want others to
> spend hours bisecting it.
> 
> -- 
> Regards,
> Pavel Roskin

[snip]

> [  241.483499] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  241.484888] gnome-settings- D 0000000000000000  6136  3155      1 0x00000000
> [  241.486265]  ffff8801298afcc8 0000000000000046 ffff88012993b420 0000000000000246
> [  241.487652]  ffff8801298affd8 ffff8801298affd8 ffff8801298affd8 ffff8801298affd8
> [  241.489035]  ffff88012993a2c0 ffff88012993b420 ffff880128cb0b80 ffff88012993b420
> [  241.490421] Call Trace:
> [  241.491765]  [<ffffffff815d8064>] schedule+0x24/0x70
> [  241.493116]  [<ffffffff81043945>] exit_mm+0x85/0x120
> [  241.494467]  [<ffffffff8104628f>] do_exit+0x13f/0x580
> [  241.495784]  [<ffffffff81046957>] do_group_exit+0x47/0xb0
> [  241.497090]  [<ffffffff810597ef>] get_signal_to_deliver+0x21f/0x580
> [  241.498384]  [<ffffffff81002014>] do_signal+0x24/0x120
> [  241.499673]  [<ffffffff815da28c>] ? sysret_signal+0x5/0x47
> [  241.500964]  [<ffffffff81002195>] do_notify_resume+0x65/0xa0
> [  241.502239]  [<ffffffff8129342e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [  241.503525]  [<ffffffff815da51a>] int_signal+0x12/0x17
> [  241.504813] no locks held by gnome-settings-/3155.
> [  241.506095] INFO: task gnome-settings-:3156 blocked for more than 120 seconds.
> [  241.507405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  241.508721] gnome-settings- D 00000000ffffffff  5104  3156      1 0x00000000
> [  241.510056]  ffff88011c80bcc8 0000000000000046 ffff88011c80bc78 ffffffff8109ad9d
> [  241.511414]  ffff88011c80bfd8 ffff88011c80bfd8 ffff88011c80bfd8 ffff88011c80bfd8
> [  241.512780]  ffff8801279fd6e0 ffff88012993a2c0 ffff880129809d28 ffff88012993a2c0
> [  241.514134] Call Trace:
> [  241.515480]  [<ffffffff8109ad9d>] ? trace_hardirqs_on_caller+0x7d/0x120
> [  241.516857]  [<ffffffff815d8064>] schedule+0x24/0x70
> [  241.518219]  [<ffffffff81043945>] exit_mm+0x85/0x120
> [  241.519592]  [<ffffffff8104628f>] do_exit+0x13f/0x580
> [  241.520952]  [<ffffffff81046957>] do_group_exit+0x47/0xb0
> [  241.522315]  [<ffffffff810597ef>] get_signal_to_deliver+0x21f/0x580
> [  241.523687]  [<ffffffff81002014>] do_signal+0x24/0x120
> [  241.525066]  [<ffffffff8109b8a4>] ? __lock_is_held+0x54/0x70
> [  241.526436]  [<ffffffff815da28c>] ? sysret_signal+0x5/0x47
> [  241.527841]  [<ffffffff81002195>] do_notify_resume+0x65/0xa0
> [  241.529237]  [<ffffffff8129342e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [  241.530625]  [<ffffffff815da51a>] int_signal+0x12/0x17
> [  241.532009] no locks held by gnome-settings-/3156.
> [  241.533386] INFO: task gnome-settings-:3157 blocked for more than 120 seconds.
> [  241.534778] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  241.536196] gnome-settings- D 00000000ffffffff  5432  3157      1 0x00000000
> [  241.537632]  ffff8801279efcc8 0000000000000046 ffff88012993c580 0000000000000246
> [  241.539069]  ffff8801279effd8 ffff8801279effd8 ffff8801279effd8 ffff8801279effd8
> [  241.540515]  ffff88012661e840 ffff88012993c580 ffff880128cb0b80 ffff88012993c580
> [  241.541961] Call Trace:
> [  241.543377]  [<ffffffff815d8064>] schedule+0x24/0x70
> [  241.544802]  [<ffffffff81043945>] exit_mm+0x85/0x120
> [  241.546226]  [<ffffffff8104628f>] do_exit+0x13f/0x580
> [  241.547642]  [<ffffffff81046957>] do_group_exit+0x47/0xb0
> [  241.549053]  [<ffffffff810597ef>] get_signal_to_deliver+0x21f/0x580
> [  241.550468]  [<ffffffff815d940b>] ? _raw_spin_unlock_irq+0x2b/0x50
> [  241.551878]  [<ffffffff81002014>] do_signal+0x24/0x120
> [  241.553297]  [<ffffffff81122e77>] ? might_fault+0x57/0xb0
> [  241.554726]  [<ffffffff815da28c>] ? sysret_signal+0x5/0x47
> [  241.556148]  [<ffffffff81002195>] do_notify_resume+0x65/0xa0
> [  241.557588]  [<ffffffff8129342e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [  241.557590]  [<ffffffff815da51a>] int_signal+0x12/0x17
> [  241.557595] no locks held by gnome-settings-/3157.

> @@ -443,14 +453,23 @@ static void wait_for_dump_helpers(struct file *file)
>  static int umh_pipe_setup(struct subprocess_info *info, struct cred *new)
>  {
>  	struct file *files[2];
> +	struct fdtable *fdt;
>  	struct coredump_params *cp = (struct coredump_params *)info->data;
> +	struct files_struct *cf = current->files;
>  	int err = create_pipe_files(files, 0);
>  	if (err)
>  		return err;
>  
>  	cp->file = files[1];
>  
> -	replace_fd(0, files[0], 0);
> +	sys_close(0);
> +	fd_install(0, files[0]);
> +	spin_lock(&cf->file_lock);
> +	fdt = files_fdtable(cf);
> +	__set_open_fd(0, fdt);
> +	__clear_close_on_exec(0, fdt);
> +	spin_unlock(&cf->file_lock);

Very interesting.  Let's see what could be going on there.  First of
all, create_pipe_files() has return 0.  There's only one exit where
it can do so and it dereferences both files[0] and files[1] on the
way there.  Since that hadn't oopsed, they are both non-NULL.

IOW, replace_fd(0, files[0], 0) is equivalent to
{
	int err;
	struct file *f = files[0];
        struct files_struct *files = current->files;
        if (0 >= rlimit(RLIMIT_NOFILE))
                return -EMFILE;

        spin_lock(&files->file_lock);
        err = expand_files(files, 0);
        if (unlikely(err < 0))
                goto out_unlock;
        return do_dup2(files, f, 0, 0);

out_unlock:
        spin_unlock(&files->file_lock);
        return err;
}

Aha.  We have the first thing to test - is RLIMIT_NOFILE set to 0 for that
process, by any chance?  We definitedly need to check replace_fd() return
value, BTW.  OK, suppose it wasn't 0.  So we'd done expand_files(files, 0).
expand_files() begins with
        /* Do we need to expand? */
        if (nr < fdt->max_fds)
                return 0;
and ->max_fds is never going to be less than NR_OPEN_DEFAULT, which is
definitely positive.  So expand_files(files, 0) is going to return 0
without doing anything.  And we get to do_dup2():
        fdt = files_fdtable(files);
        tofree = fdt->fd[0];
        if (!tofree && fd_is_open(0, fdt))
                goto Ebusy;
that really shouldn't be possible to hit in this case - it can happen
only for shared descriptor table.  Another thread would need to have
already reserved fd 0, but hadn't got around to installing a reference
to struct file into it.  But we are called from the freshly created
kernel thread, about to exec some userland binary.  Those are spawned
without CLONE_FILES in flags, so the table can't be shared.  It's fresh
from dup_fd() and this bit
                        /*
                         * The fd may be claimed in the fd bitmap but not yet
                         * instantiated in the files array if a sibling thread
                         * is partway through open().  So make sure that this
                         * fd is available to the new process.
                         */
                        __clear_open_fd(open_files - i, new_fdt);
in there would have prevented anything of that kind.

Now we have tofree equal to whatever we used to have in ->fd[0] (realistically
that's going to be NULL).  Next we have
        get_file(file);
        rcu_assign_pointer(fdt->fd[0], file);
        __set_open_fd(0, fdt);
        if (0 & O_CLOEXEC)
                __set_close_on_exec(fd, fdt);
        else
                __clear_close_on_exec(fd, fdt);
        spin_unlock(&files->file_lock);

        if (tofree)
                filp_close(tofree, files);

Now, note that sys_close(0) would have done something very similar -
        file = fdt->fd[0];
        if (!file)
                goto out_unlock;
        rcu_assign_pointer(fdt->fd[0], NULL);
        __clear_close_on_exec(0, fdt);
        __put_unused_fd(files, 0);
        spin_unlock(&files->file_lock);
        filp_close(file, files);
'file' here is what we had in 'tofree'.  We'd ripped it out of descriptor
table and done filp_close() on it.  When we follow with 
	fd_install(0, files[0]);
...
	__set_open_fd(0, fdt);
	__clear_close_on_exec(0, fdt);
we have get_file() done on files[0], files[0] put into ->fd[0] (fd_install()),
result of __put_unused(files, 0); undone (by __set_open_fd()) and we come
to exactly the same situation as with replace_fd().

	All right, so the next step in debugging is to print the damn
return value of replace_fd().  That should narrow the things down.
Another interesting question, of course, is what makes that gnome turd
coredump on each boot?

next prev parent reply	other threads:[~2012-10-15 21:51 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-15 21:14 Bisected regression in Linux 3.7.0-rc1, hang on login caused by replace_fd() Pavel Roskin
2012-10-15 21:51 ` Al Viro [this message]
2012-10-15 23:11   ` Pavel Roskin
2012-10-15 23:27     ` Al Viro
2012-10-15 23:40       ` Pavel Roskin
2012-10-15 23:48         ` Al Viro
2012-10-16 15:02           ` Pavel Roskin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121015215127.GK2616@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=linux-kernel@vger.kernel.org \
    --cc=proski@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.