linux-hotplug.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
To: linux-hotplug@vger.kernel.org
Subject: Re: [GIT] Experimental threaded udev
Date: Mon, 01 Jun 2009 13:46:17 +0000	[thread overview]
Message-ID: <4A23DBA9.30501@tuffmail.co.uk> (raw)
In-Reply-To: <4A1EA138.10400@tuffmail.co.uk>

Kay Sievers wrote:
> On Mon, Jun 1, 2009 at 11:29, Alan Jenkins <alan-jenkins@tuffmail.co.uk> wrote:
>   
>>> The thread- and worker-versions do not create as many COW page-faults in
>>> the daemon after every cloned event-process, and therefore need much
>>> less CPU.
>>>
>>> At least on the dual-core laptop here, the pool of workers seems to be
>>> faster than the threads.
>>>
>>>       
>> Great, you beat me to it.  It makes sense that this would be _bit_ faster
>> than threads.  As I say I tried to avoid the page faults caused (somehow) by
>> forking the extra thread-stack mappings, but it didn't really work out.  I'm
>> surprised by quite how much faster it is though!
>>     
>
> Yeah, the 1.2 sec system time looked a bit too scary.
>
>   
>> I have some thoughts which may help in bringing the code up from "rough
>> hack" quality :-).
>>     
>
> Cool. :)
>
>   
>> Aren't signal queues unreliable?  If you exceed the maximum number of queued
>> signals, sigqueue will fail with EAGAIN, and I don't think there's a
>> blocking version :-).
>>     
>
> Right, it's a rlimit, and I think I checked and remember 40.000+
> signals here. The workers could detect, and re-send a non-queued
> signal, if needed.
>   

Non-queued signal transmission isn't blocking either.  The signal just 
gets dropped if there's already one in-flight.

>   
>> I think a  pipe would be better
>>     
>
> Maybe. I was lazy and tried to avoid file descriptors, in case we get
> a really large number of workers to maintain. :)
>
>   
>> or maybe you can do something with netlink.
>>     
>
> Right, but then, I guess, we would need to do MSG_PEEK, or something,
> to find out if the main daemon received a kernel message or a worker
> message, before trying to do a receive_device().
>   

Ok.  I meant we could send all the completions down the same pipe, which 
wouldn't require lots of FDs.  Before this patch, we were happily 
sharing kernel_monitor->sock between all the child processes.



>> +                       /* wait for more device messages from udevd */
>> +                       do
>> +                               dev = udev_monitor_receive_device(monitor);
>> +                       while (dev = NULL);
>>
>> I think this loop should finish when the signal handler sets worker_exit?
>>  But maybe you didn't actually install a handler for SIGTERM and it's still
>> being reset to the default action:
>>     
>
> Yeah, it's not handled with worker_exit. Now, it's not reliable to
> kill event processes from something else than the main daemon. The
> worker_exit might be nice for valgrind tests though.
>
>   
>> I'm not sure you're handling worker processes crashing, it looks like you
>> might leave the event in limbo if you get SIGCHLD for a worker in state
>> WORKER_RUNNING.  I'm sure you'll test that though.
>>     
>
> Maybe we can set the event to QUEUED, when a worker dies with an event attached.
>   

Retry the event, you mean?  How many times? :-).

The code looked like you freed the worker, but left the event RUNNING, 
and it would never be released.  I would delete the event instead, just 
like the old system.

I haven't read V2 yet though, maybe you fixed it.

>   
>> Talking of unexpected crashes.  I anticipated using a connection-oriented
>> socketpair for passing events.  Using netlink for this means the worker
>> threads don't get a nice notification if the daemon dies without killing
>> them.
>>     
>
> Right, thats not too nice. I guess we have pretty much lost if the
> main daemon dies unexpectedly. We need to find out, if we want netlink
> and signals, or socketpair()s here, I guess.
>
>   
>> Unless this is covered by udevd being the session leader, or
>> something like that?  I'll test this empirically - maybe it's not important,
>> but I think it's good to know what will happen.
>>     
>
> If the main daemon exits normally, it sends TERM to the entire process group
>
>   
>> BTW I had a go at this too, but a couple of my workers fail in about 1 run
>> out of 20, apparently because calloc() returns NULL without setting errno.
>>  I'm sure it's my fault but I'll try to track it down.  Obviously I'll let
>> you know if your patch could be affected by the same problem.
>>     
>
> Oh, strange. Would be good to know what causes this.
>   
Don't worry about it, it was just my buggy code :-).  I used event->udev 
after udev_event_unref(event).

Thanks
Alan


  parent reply	other threads:[~2009-06-01 13:46 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-05-28 14:35 [GIT] Experimental threaded udev Alan Jenkins
2009-05-28 15:09 ` Kay Sievers
2009-05-28 15:39 ` Alan Jenkins
2009-05-29 17:53 ` Alan Jenkins
2009-05-29 18:11 ` Kay Sievers
2009-06-01  2:41 ` Kay Sievers
2009-06-01  9:29 ` Alan Jenkins
2009-06-01 11:32 ` Kay Sievers
2009-06-01 12:33 ` Kay Sievers
2009-06-01 13:30 ` Kay Sievers
2009-06-01 13:46 ` Alan Jenkins [this message]
2009-06-01 13:57 ` Kay Sievers
2009-06-01 16:22 ` Kay Sievers
2009-06-01 16:24 ` Alan Jenkins
2009-06-01 19:39 ` Kay Sievers
2009-06-02  4:58 ` Kay Sievers
2009-06-02  9:13 ` Alan Jenkins
2009-06-02  9:26 ` Alan Jenkins
2009-06-02 11:39 ` Kay Sievers
2009-06-02 14:05 ` Kay Sievers
2009-06-03 19:44 ` Kay Sievers
2009-06-03 20:46 ` Alan Jenkins
2009-06-03 22:20 ` Kay Sievers
2009-06-03 23:53 ` Kay Sievers
2009-06-06 14:20 ` Kay Sievers
2009-06-06 17:01 ` Bryan Kadzban
2009-06-08 11:45 ` Scott James Remnant
2009-06-08 16:29 ` Bryan Kadzban

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4A23DBA9.30501@tuffmail.co.uk \
    --to=alan-jenkins@tuffmail.co.uk \
    --cc=linux-hotplug@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).