public inbox for linux-kernel@vger.kernel.org
From: David Howells <dhowells@redhat.com>
To: Stephen Lord <lord@xfs.org>
Cc: "Prarit Bhargava" <prarit@sgi.com>,
	"Pozsár Balázs" <pozsy@uhulinux.hu>,
	linux-kernel@vger.kernel.org, davej@redhat.com, ak@suse.de
Subject: Re: Race condition in module load causing undefined symbols
Date: Thu, 28 Jul 2005 20:42:48 +0100
Message-ID: <28920.1122579768@warthog.cambridge.redhat.com>
In-Reply-To: <42B02004.6020306@xfs.org>


Hi Steve,

Someone's finally waved this discussion in my direction.

> Still puzzled about what could have been fixed in user space since this
> appears to affect more than one shell. Module loading appears to be
> very synchronous, so unless the shell was not waiting for exit status
> on children correctly, it seems hard to explain in user space.

The problem with nash is very simple, and may be duplicated in other shells:

 (1) The patch that I made to the module wangling patch makes use of
     stop_machine_run() to insert a module into the module list or remove it
     from the module list.

     This is done because certain things that look at the list have to run
     without locks, so the only way to be certain they aren't going to run is
     to ensure that _nothing_ else is going to run.

 (2) stop_machine_run() creates a bunch of kernel threads to hog the other
     CPUs with interrupts disabled whilst one CPU does the actual work.

 (3) These kernel threads are reparented to the init process (PID 1).

 (4) When "parentless" threads exit, whatever process is running as PID 1 gets
     to deal with the zombies and will get a wait() event for each.

 (5) nash runs as PID 1 during boot.

 (6) nash was NOT checking the pid returned by its calls to wait(); in
     particular, when it forked off an insmod process, it would simply wait
     for the first wait event to happen and then continue on, without checking
     that the process it was waiting for had actually finished.
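
The check missing in (6) can be sketched roughly as below. This is not the
actual nash fix, just a minimal illustration under stated assumptions: the
names wait_for_child() and demo_wait_for_loader() are hypothetical, the
"stray" child stands in for an unrelated process that PID 1 has to reap, and
the "loader" child stands in for insmod.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling wait() until the pid it returns is the
 * one we actually care about. Any other child that turns up is reaped and
 * otherwise ignored. Returns 0 on success, -1 if wait() fails (e.g. ECHILD).
 */
static int wait_for_child(pid_t target, int *status)
{
    assert(status != NULL);

    for (;;) {
        int st;
        pid_t reaped = wait(&st);

        if (reaped == -1) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;
        }
        if (reaped == target) {
            *status = st;
            return 0;
        }
        /* Someone else's zombie: it is reaped now, keep waiting. */
    }
}

/*
 * Hypothetical demo: a stray child exits at once, the "loader" exits a
 * little later with code 7. Returns the loader's exit code, or -1 on error.
 */
static int demo_wait_for_loader(void)
{
    pid_t stray = fork();
    if (stray < 0)
        return -1;
    if (stray == 0)
        _exit(1);               /* unrelated child, exits first */

    pid_t loader = fork();
    if (loader < 0)
        return -1;
    if (loader == 0) {
        struct timespec delay = { 0, 100 * 1000 * 1000 };  /* 100 ms */
        nanosleep(&delay, NULL);
        _exit(7);               /* stand-in for insmod's exit status */
    }

    int status;
    if (wait_for_child(loader, &status) != 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A bare wait() in place of wait_for_child() would most likely return as soon
as the stray child exits, which is the behaviour described in (6): the caller
carries on while the loader is still running.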

That is the basic problem being seen. It rarely happens without this patch,
but the bug is still there in nash, and could be triggered by any other
parentless process exiting or dying.
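
For the other half of the job, dealing with zombies that PID 1 did not ask
for, one possible approach is a non-blocking drain with waitpid(-1, ...,
WNOHANG). The sketch below is an assumption about how that might look, not
something nash is known to do; reap_exited_children() and demo_reap_strays()
are hypothetical names, and ordinary forked children stand in for the
reparented threads.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: reap every child that has already exited, without
 * blocking. Returns the number of children reaped.
 */
static int reap_exited_children(void)
{
    int reaped = 0;

    /* waitpid() returns 0 if children exist but none has exited yet,
     * and -1 (ECHILD) once there are no children left at all. */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    return reaped;
}

/*
 * Hypothetical demo: create three children that exit at once, make sure
 * they have all become zombies, then drain them in one go.
 */
static int demo_reap_strays(void)
{
    enum { NSTRAYS = 3 };
    pid_t pids[NSTRAYS];

    for (int i = 0; i < NSTRAYS; i++) {
        pids[i] = fork();
        if (pids[i] < 0)
            return -1;
        if (pids[i] == 0)
            _exit(0);
    }

    /* WNOWAIT blocks until the child has exited but leaves it as a
     * zombie, so the drain below has something definite to collect. */
    for (int i = 0; i < NSTRAYS; i++) {
        siginfo_t info;
        if (waitid(P_PID, (id_t)pids[i], &info, WEXITED | WNOWAIT) != 0)
            return -1;
    }

    return reap_exited_children();
}
```

Assuming the calling process has no other exited children at that moment,
demo_reap_strays() should report three reaped children.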

I got nash patched, but it seems to be taking a while to percolate.

David


