From: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
To: linux-hotplug@vger.kernel.org
Subject: Re: [PATCH v2] udevd: don't use alarm() for timeouts
Date: Tue, 26 May 2009 18:05:01 +0000
Message-ID: <4A1C2F4D.6000500@tuffmail.co.uk>
In-Reply-To: <4A1BC506.3090804@tuffmail.co.uk>

Kay Sievers wrote:
> On Tue, May 26, 2009 at 12:31, Alan Jenkins <alan-jenkins@tuffmail.co.uk> wrote:
>   
>> alarm() is per-process; we can't use it for timeouts in a multi-threaded
>> udevd.  We need to explicitly track the total time spent waiting for
>> each event.
>>
>> There is an issue here if the timeout expires in run_program().  If
>> run_program() returns without calling wait(), the process it forked
>> will become a zombie when it finally exits.  Currently, udev-event is
>> implemented as a process, so the zombie will be reparented to init and
>> reaped once the event is finished.  But that won't work for threads.
>>
>> The solution is to fork twice:
>>
>> udevd (event thread)
>>   udevd child process
>>     command
>>
>> When the timeout expires, the event thread can return immediately.
>> The child process will stay blocked in wait(), until the command finally
>> finishes (or is killed).  We'll get a nice process tree showing that the
>> hung process was started by udevd :-).
>>     
>
> The event runs all programs serialized, one after the other, can't we
> just kill the program that does not return in time, and wait for it to
> cleanup the process, instead of just exiting the event process?
>   

That makes sense if you also change the timeout model - from per-event
timeouts, to per-command timeouts.
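
To make the two timeout models concrete, here is a minimal sketch of the
per-event one, where every command in an event draws on a single shared
deadline.  The names (event_budget, budget_start, budget_remaining, now_ms)
are hypothetical and do not come from the udev source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Hypothetical helper: milliseconds on the monotonic clock. */
static long long now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Per-event model: one deadline shared by all commands of the event. */
struct event_budget {
    long long deadline_ms;
};

static void budget_start(struct event_budget *b, long long now, int timeout_ms)
{
    assert(timeout_ms >= 0);
    b->deadline_ms = now + timeout_ms;
}

/* Time left for the next command; 0 once the whole event has timed out. */
static int budget_remaining(const struct event_budget *b, long long now)
{
    long long left = b->deadline_ms - now;

    return left > 0 ? (int)left : 0;
}
```

Under a per-command model the budget would go away: each command would
simply be given the full timeout, independent of how long earlier commands
in the same event took.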

The second fork is cheap, because it's a vfork.  But if you like
per-command timeouts, I would be happy to see it go away.

I'm not sure about killing.  Do we need to escalate to SIGKILL?  Do we
e.g. allow half the timeout before sending SIGTERM, then another half
before sending SIGKILL?  The process could even be unkillable - is it ok
to block after SIGKILL, or do we need another timeout?
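
For illustration, the "half and half" escalation could look roughly like
the sketch below.  It is only one possible answer to the questions above,
not something the patch does; wait_ms() and reap_with_escalation() are
hypothetical names, and the final blocking waitpid() is exactly the
"block after SIGKILL" case.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Poll for the child for up to timeout_ms.  Returns 1 if it was reaped. */
static int wait_ms(pid_t pid, int timeout_ms, int *status)
{
    const struct timespec tick = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    int waited;

    for (waited = 0; ; waited += 10) {
        if (waitpid(pid, status, WNOHANG) == pid)
            return 1;
        if (waited >= timeout_ms)
            return 0;
        nanosleep(&tick, NULL);
    }
}

/*
 * Give the command half the timeout, send SIGTERM, give it the other half,
 * then send SIGKILL.  Returns 0 if it exited by itself, otherwise the last
 * signal that was sent before it was reaped.
 */
static int reap_with_escalation(pid_t pid, int timeout_ms, int *status)
{
    assert(pid > 0 && timeout_ms >= 0);

    if (wait_ms(pid, timeout_ms / 2, status))
        return 0;
    kill(pid, SIGTERM);
    if (wait_ms(pid, timeout_ms / 2, status))
        return SIGTERM;
    kill(pid, SIGKILL);
    /* No timeout here: this blocks for as long as the process is unkillable. */
    while (waitpid(pid, status, 0) < 0 && errno == EINTR)
        ;
    return SIGKILL;
}
```

A caller would fork the command, then call reap_with_escalation(pid,
timeout_ms, &status) instead of a plain waitpid().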

I could keep a list of timed out processes instead, and reap them on
SIGCHLD.  Would that be better?
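
A rough sketch of that alternative, assuming a small fixed-size list and a
SIGCHLD handler that only sets a flag for the main loop to act on.  All
names here (pending, pending_add, pending_reap, watch_sigchld) are
hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_PENDING 64

static pid_t pending[MAX_PENDING];   /* commands that timed out, not yet reaped */
static size_t npending;
static volatile sig_atomic_t got_sigchld;

static void on_sigchld(int sig)
{
    (void)sig;
    got_sigchld = 1;   /* the real work happens in the main loop */
}

static int watch_sigchld(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Remember a timed-out command so it can be reaped later. */
static int pending_add(pid_t pid)
{
    assert(pid > 0);
    if (npending == MAX_PENDING)
        return -1;
    pending[npending++] = pid;
    return 0;
}

/* Reap every listed child that has exited; returns how many were reaped. */
static int pending_reap(void)
{
    int reaped = 0;
    size_t i = 0;

    got_sigchld = 0;
    while (i < npending) {
        pid_t r = waitpid(pending[i], NULL, WNOHANG);

        if (r == 0) {          /* still running, keep it on the list */
            i++;
            continue;
        }
        if (r > 0)
            reaped++;
        pending[i] = pending[--npending];   /* drop reaped or unknown pid */
    }
    return reaped;
}
```

The main loop would call pending_reap() whenever got_sigchld is set.  The
event itself never blocks on a hung command, at the cost of keeping the
list around for as long as the command lives.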

There are other workarounds for the lack of a timeout in sys_wait(), so
I don't think that's a problem.  We can require that commands close
stdout & stderr pipes when they exit - i.e. do not pass them on to a
long-running child.  (At the moment there's a Debian script, "net.agent",
which has to do this for debug mode; it would need fixing to always do it.)
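
One reading of this workaround, sketched under the assumption that the
command's output is read through a pipe: poll() accepts a timeout where
wait() does not, and end-of-file on the pipe stands in for "the command has
exited", provided nothing else still holds the write end.  drain_until_eof()
and mono_ms() are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

static long long mono_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Read and discard output until every writer has closed the pipe.
 * Returns 1 on EOF, 0 if timeout_ms expired first, -1 on error.
 */
static int drain_until_eof(int fd, int timeout_ms)
{
    long long deadline;
    char buf[256];

    assert(fd >= 0 && timeout_ms >= 0);
    deadline = mono_ms() + timeout_ms;

    for (;;) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        long long left = deadline - mono_ms();
        ssize_t r;
        int n;

        if (left < 0)
            left = 0;
        n = poll(&pfd, 1, (int)left);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return 0;              /* timed out, a writer is still alive */
        r = read(fd, buf, sizeof(buf));
        if (r == 0)
            return 1;              /* EOF: all write ends are closed */
        if (r < 0 && errno != EINTR)
            return -1;
    }
}
```

If the command hands its stdout or stderr to a long-running child, the
write end stays open and this returns 0 even though the command itself has
exited, which is why the requirement above matters.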

Alan
