* [PATCH] fix a race condition on udev worker shutdown
@ 2011-10-08 9:14 Steve Langasek
2011-10-09 2:56 ` Kay Sievers
0 siblings, 1 reply; 2+ messages in thread
From: Steve Langasek @ 2011-10-08 9:14 UTC (permalink / raw)
To: linux-hotplug
[-- Attachment #1: Type: text/plain, Size: 3606 bytes --]
Hi there,
A race condition in udev has turned up in Ubuntu's initramfs, causing
60-second delays in the boot sequence because udevd loses track of one of
its workers: it's waiting for a pending event to be processed, but the
worker it was dispatched to has already exited because the event and signal
were received by the worker at the same time and the signal took precedence.
There are a few ways we could deal with this.
- Let udevd clear its list harder when a worker exits, since if the process
hasn't already sent back its information it's not going to do so from
beyond the grave and we might as well clean up the references to it.
This seems to be safe because fd_worker is processed before fd_signal in
the main event loop, so any last words will be duly recorded.
- Reorder the event loop so that signals and the control socket are
processed before the event queue. This way, udev would never dispatch a
new event to a worker after it's been told to exit, which is what happens
now. However, there may be reasons for the current order that I don't
see offhand; and anyway, I don't think the reordering is a guarantee that
the event and signal wouldn't arrive at the same time, it just makes it
orders of magnitude less likely.
- When a worker sees that an event has arrived, process it immediately.
This ensures everything udevd has delegated is finished before the worker
handles the signal (i.e., exits). This means the worker will take longer
to finish up in some cases, but probably not 60 seconds longer...
The below patch implements option three. Maybe one of the other options
should also be implemented, but in my testing #3 seems to be sufficient in
its own right to solve the problem.
An implicit assumption in this patch is that, when an event and signal
arrive together, the event is returned *first* from epoll_wait(). This has
been the case in my tests, but I haven't spotted any reason that it's
guaranteed to always be the case.
Cheers,
--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://www.debian.org/
slangasek@ubuntu.com vorlon@debian.org
From 84ead317ea041efa066b5134d5df0fd6fc075aa9 Mon Sep 17 00:00:00 2001
From: Steve Langasek <steve.langasek@canonical.com>
Date: Sat, 8 Oct 2011 01:34:32 -0700
Subject: [PATCH] Process events before signals in worker
When a worker receives both a signal and a udev event in the same epoll_wait
run, the event must be processed first because the udev parent considers the
event already dispatched. If we process the signal first and exit, udevd
times out after 60 seconds waiting for a response from an already-dead
worker.
Ref: https://bugs.launchpad.net/bugs/818177
Signed-off-by: Steve Langasek <steve.langasek@canonical.com>
---
udev/udevd.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/udev/udevd.c b/udev/udevd.c
index 77aec9d..b65b53f 100644
--- a/udev/udevd.c
+++ b/udev/udevd.c
@@ -347,6 +347,7 @@ static void worker_new(struct event *event)
for (i = 0; i < fdcount; i++) {
if (ev[i].data.fd == fd_monitor && ev[i].events & EPOLLIN) {
dev = udev_monitor_receive_device(worker_monitor);
+ break;
} else if (ev[i].data.fd == fd_signal && ev[i].events & EPOLLIN) {
struct signalfd_siginfo fdsi;
ssize_t size;
--
1.7.5.4
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH] fix a race condition on udev worker shutdown
2011-10-08 9:14 [PATCH] fix a race condition on udev worker shutdown Steve Langasek
@ 2011-10-09 2:56 ` Kay Sievers
0 siblings, 0 replies; 2+ messages in thread
From: Kay Sievers @ 2011-10-09 2:56 UTC (permalink / raw)
To: linux-hotplug
On Sat, Oct 8, 2011 at 11:14, Steve Langasek
<steve.langasek@canonical.com> wrote:
> Subject: [PATCH] Process events before signals in worker
>
> When a worker receives both a signal and a udev event in the same epoll_wait
> run, the event must be processed first because the udev parent considers the
> event already dispatched. Â If we process the signal first and exit, udevd
> times out after 60 seconds waiting for a response from an already-dead
> worker.
Applied.
Thanks,
Kay
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2011-10-09 2:56 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-10-08 9:14 [PATCH] fix a race condition on udev worker shutdown Steve Langasek
2011-10-09 2:56 ` Kay Sievers
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).