From: Kay Sievers <kay.sievers@vrfy.org>
To: linux-hotplug@vger.kernel.org
Subject: Re: Hanging udev process on nfs-mounted /dev
Date: Fri, 01 Oct 2004 08:08:47 +0000
Message-ID: <1096618128.4295.47.camel@localhost.localdomain>
In-Reply-To: <415980BF.1020401@bio.ifi.lmu.de>
On Fri, 2004-10-01 at 09:38 +0200, Frank Steiner wrote:
> Hi,
>
> here we go :-) On reboot, one of the clients ran into the hanging
> udev process. Although the timeout patch was applied, the hanging
> udev process was not killed.
That's ok. The signal handler does not kill the process. It is just a
timeout to interrupt a system call that is waiting for the kernel. The tdb
code returns an error if it catches that timeout. The hanging udev instance
is spinning on its own (not blocked in a system call) and will therefore
keep doing that forever.
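
To illustrate the mechanism, here is a minimal sketch of a timeout that only interrupts a blocking lock call. It is not taken from the udev or tdb sources; lock_with_timeout() and demo_contended_lock() are hypothetical names, and offset 164 is just a value borrowed from the trace. The handler is installed without SA_RESTART, so a blocked fcntl(F_SETLKW) fails with EINTR when the alarm fires, and the process itself is left alone.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Empty handler: its only job is to make a blocked system call fail with EINTR. */
static void on_alarm(int signo)
{
    (void)signo;
}

/*
 * Try to take a one-byte write lock at `offset`, waiting at most `seconds`.
 * Returns 0 on success, -1 on failure; errno is EINTR if the alarm fired.
 * Hypothetical helper for illustration only.
 */
int lock_with_timeout(int fd, off_t offset, unsigned int seconds)
{
    struct sigaction sa, old_sa;
    struct flock fl;
    int rc, saved_errno;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* no SA_RESTART: fcntl() is not restarted */
    if (sigaction(SIGALRM, &sa, &old_sa) < 0)
        return -1;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = offset;
    fl.l_len = 1;

    alarm(seconds);
    rc = fcntl(fd, F_SETLKW, &fl);   /* blocks in the kernel if contended */
    saved_errno = errno;
    alarm(0);
    sigaction(SIGALRM, &old_sa, NULL);
    errno = saved_errno;
    return rc;
}

/*
 * Demo: a child holds the lock while the parent waits for it with a
 * one-second timeout. Returns the parent's errno (EINTR is expected),
 * 0 if the parent unexpectedly got the lock, or -1 on setup failure.
 */
int demo_contended_lock(void)
{
    char path[] = "/tmp/lockdemoXXXXXX";
    int fd, pipefd[2], result = -1;
    char ready;
    pid_t child;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (pipe(pipefd) < 0) {
        close(fd);
        return -1;
    }
    child = fork();
    if (child < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        close(fd);
        return -1;
    }
    if (child == 0) {
        /* Child: take the lock, tell the parent, then keep holding it. */
        close(pipefd[0]);
        if (lock_with_timeout(fd, 164, 1) != 0 || write(pipefd[1], "x", 1) != 1)
            _exit(1);
        sleep(30);
        _exit(0);
    }
    close(pipefd[1]);
    if (read(pipefd[0], &ready, 1) == 1)
        result = (lock_with_timeout(fd, 164, 1) < 0) ? errno : 0;
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    close(pipefd[0]);
    close(fd);
    return result;
}
```

In this sketch the waiting process simply gets an error back from the lock call and carries on; nothing is killed.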
> But it blocked a lot of other processes because there are messages
> about "timeout reached" in /var/log/messages. I had to reboot the
> PC (the professor's client :-)), but I tried to collect all information
> that might be helpful.
Yes, sure, it is. We're getting closer.
> I've put all the logs on a website. They include /var/log/messages
> from the point where the system booted until it hung, a "ps -aux" output
> while udev was hanging, and the straces for all udev processes started
> during the boot. Recall that I replaced /sbin/udev{start} by
>
> strace -o /var/log/udev.log.`uname -n`.${$} -f /sbin/utest/`basename $0` $@
>
> and moved the original udev and udevstart to /sbin/utest/.
> All the information is here: http://www.bio.ifi.lmu.de/~steiner/udev/
> The udev traces are sorted in "ls -lat" order.
>
> The udev process that was hanging had pid 9700. The matching strace
> is udev.log.noether.9652. After calling "pkill udev" to make the
> host usable again, three straces were changed. Those are listed
> with both versions, so that one can see what happened after killing
> (don't know if this helps). Again, the hanging udev process hung
> after F_SETLKW:
> ...
> 9700 fcntl64(5, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=288, len=1}) = 0
> 9700 fcntl64(5, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=74924, len=1}) = 0
> 9700 fcntl64(5, F_SETLK, {type=F_UNLCK, whence=SEEK_SET, start=74924, len=1}) = 0
> 9700 fcntl64(5, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=164, len=1}) = 0
> 9700 --- SIGALRM (Alarm clock) @ 0 (0) ---
> 9700 time([1096612648]) = 1096612648
> 9700 rt_sigaction(SIGPIPE, {0x40116ae0, [], SA_RESTORER, 0x40067aa8}, {SIG_DFL}, 8) = 0
> 9700 send(0, "<14>Oct 1 08:37:28 udev: error:"..., 137, 0) = 137
> 9700 rt_sigaction(SIGPIPE, {SIG_DFL}, NULL, 8) = 0
> 9700 sigreturn() = ? (mask now [])
Yes, that's the fault. It seems that this process locks the db file and
then keeps spinning forever without making any system calls. It's just a
loop inside the tdb code. It consumed a lot of your CPU:
> root 9688 0.0 0.0 1696 600 ? S< 08:37 0:00 strace -o /var/log/udev.log.noether.9652 -f /sbin/utest/udev scsi_generic
> root 9700 99.9 0.0 1664 604 ? R< 08:37 17:37 /sbin/utest/udev scsi_generic
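
For contrast, a loop that never enters the kernel is not stopped by the alarm: the handler runs, returns, and the loop continues. A minimal sketch of one way such a loop could notice the timeout, assuming it polls a flag set by the handler (spin_with_timeout() and never_done() are hypothetical names, and this is not how the tdb code is written):

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by the SIGALRM handler, polled by the loop. */
static volatile sig_atomic_t timed_out;

static void note_timeout(int signo)
{
    (void)signo;
    timed_out = 1;
}

/*
 * Call done(ctx) repeatedly until it returns nonzero or `seconds` elapse.
 * Returns 0 if the work completed, -1 on timeout or setup failure.
 * Hypothetical helper for illustration only.
 */
int spin_with_timeout(int (*done)(void *), void *ctx, unsigned int seconds)
{
    struct sigaction sa, old_sa;
    int rc = -1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = note_timeout;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old_sa) < 0)
        return -1;

    timed_out = 0;
    alarm(seconds);
    /* Without the !timed_out check, this loop would spin forever. */
    while (!timed_out) {
        if (done(ctx)) {
            rc = 0;
            break;
        }
    }
    alarm(0);
    sigaction(SIGALRM, &old_sa, NULL);
    return rc;
}

/* Stand-in for a loop condition that is never satisfied. */
int never_done(void *ctx)
{
    (void)ctx;
    return 0;
}
```

Calling spin_with_timeout(never_done, NULL, 1) gives up after about one second instead of consuming CPU indefinitely.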
Thanks,
Kay