linux-hotplug.vger.kernel.org archive mirror
From: Kay Sievers <kay.sievers@vrfy.org>
To: linux-hotplug@vger.kernel.org
Subject: Re: Hanging udev process on nfs-mounted /dev
Date: Fri, 01 Oct 2004 08:08:47 +0000
Message-ID: <1096618128.4295.47.camel@localhost.localdomain>
In-Reply-To: <415980BF.1020401@bio.ifi.lmu.de>

On Fri, 2004-10-01 at 09:38 +0200, Frank Steiner wrote:
> Hi,
> 
> here we go :-) On reboot, one of the clients ran into the hanging
> udev process. Although the timeout patch was applied, the hanging
> udev process was not killed.

That's ok. The signal handler does not kill the process. It is just a
timeout to interrupt a system call that is waiting for the kernel. The tdb
code returns unsuccessfully if it catches that timeout. The hanging udev
process is spinning by itself (not blocked in a system call) and will
therefore do that forever.
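
To illustrate the timeout mechanism described above, here is a minimal
sketch in C. It is not the udev or tdb code; the names lock_with_timeout()
and demo_lock_timeout() and the temp-file path are made up for this
example. It assumes the handler is installed without SA_RESTART, so a
blocked fcntl(F_SETLKW) fails with EINTR when the alarm fires instead of
being restarted. The handler only records the timeout and never
terminates anything.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t timed_out = 0;

/* The handler only notes that the alarm fired; it kills nothing. */
static void on_alarm(int signo)
{
    (void)signo;
    timed_out = 1;
}

/* Fill in a write lock on byte 0 of a file. */
static void byte0_wrlock(struct flock *fl)
{
    memset(fl, 0, sizeof(*fl));
    fl->l_type = F_WRLCK;
    fl->l_whence = SEEK_SET;
    fl->l_start = 0;
    fl->l_len = 1;
}

/* Hypothetical helper: wait for the lock, but give up after `seconds`.
 * Returns 0 on success, or -1 with errno == EINTR if the alarm fired. */
int lock_with_timeout(int fd, unsigned seconds)
{
    struct sigaction sa, old;
    struct flock fl;
    int rc, saved;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                 /* no SA_RESTART: blocked call gets EINTR */
    if (sigaction(SIGALRM, &sa, &old) < 0)
        return -1;

    byte0_wrlock(&fl);
    timed_out = 0;
    alarm(seconds);
    rc = fcntl(fd, F_SETLKW, &fl);   /* blocks in the kernel until lock or signal */
    saved = errno;
    alarm(0);
    sigaction(SIGALRM, &old, NULL);
    errno = saved;
    return rc;
}

/* Demo: a child holds the lock, so the parent's attempt must time out.
 * Returns 1 if the parent's fcntl() was interrupted with EINTR. */
int demo_lock_timeout(void)
{
    char path[] = "/tmp/lockdemoXXXXXX";
    int fd = mkstemp(path);
    int ready[2], rc, err;
    pid_t pid;
    char c;

    assert(fd >= 0);
    if (pipe(ready) < 0 || (pid = fork()) < 0)
        return -1;
    if (pid == 0) {                  /* child: take the lock and hold it */
        struct flock fl;
        byte0_wrlock(&fl);
        if (fcntl(fd, F_SETLK, &fl) < 0 || write(ready[1], "x", 1) != 1)
            _exit(1);
        pause();
        _exit(0);
    }
    close(ready[1]);
    if (read(ready[0], &c, 1) != 1)  /* wait until the child owns the lock */
        return -1;
    rc = lock_with_timeout(fd, 1);
    err = errno;
    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
    close(ready[0]);
    close(fd);
    unlink(path);
    return rc == -1 && err == EINTR && timed_out;
}
```

Calling demo_lock_timeout() from a main() should show the blocked lock
request coming back after about one second. The caller then has to treat
the EINTR as a failure itself, because nothing else stops the process.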

> But it blocked a lot of other processes because there are messages
> about "timeout reached" in /var/log/messages. I had to reboot the
> PC (the professors client :-)), but I tried to collect all information
> that might be helpful.

Yes, sure, it is helpful. We're getting closer.

> I've put all the logs on a website. They include /var/log/messages
> from the point where the system booted until it hung, a "ps -aux" output
> while udev was hanging, and the straces for all udev processes started
> during the boot. Recall that I replaced /sbin/udev{start} by
> 
> strace -o /var/log/udev.log.`uname -n`.${$} -f /sbin/utest/`basename $0` $@
> 
> and moved the original udev and udevstart to /sbin/utest/.
> All the information is here: http://www.bio.ifi.lmu.de/~steiner/udev/
> The udev traces are sorted in "ls -lat" order.
> 
> The udev process that was hanging had pid 9700. The matching strace
> is udev.log.noether.9652. After calling "pkill udev" to make the
> host usable again, three straces were changed. Those are listed
> with both versions, so that one can see what happened after killing
> (don't know if this helps). Again, the hanging udev process hung
> after F_SETLKW:
> ...
> 9700  fcntl64(5, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=288, len=1}) = 0
> 9700  fcntl64(5, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=74924, len=1}) = 0
> 9700  fcntl64(5, F_SETLK, {type=F_UNLCK, whence=SEEK_SET, start=74924, len=1}) = 0
> 9700  fcntl64(5, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=164, len=1}) = 0
> 9700  --- SIGALRM (Alarm clock) @ 0 (0) ---
> 9700  time([1096612648])                = 1096612648
> 9700  rt_sigaction(SIGPIPE, {0x40116ae0, [], SA_RESTORER, 0x40067aa8}, {SIG_DFL}, 8) = 0
> 9700  send(0, "<14>Oct  1 08:37:28 udev: error:"..., 137, 0) = 137
> 9700  rt_sigaction(SIGPIPE, {SIG_DFL}, NULL, 8) = 0
> 9700  sigreturn()                       = ? (mask now [])

Yes, that's the fault. It seems that this process locks the db file and
then spins forever without making any system calls. It's just a loop
inside the tdb code. It consumed a lot of your CPU:

> root      9688  0.0  0.0  1696  600 ?        S<   08:37   0:00 strace -o /var/log/udev.log.noether.9652 -f /sbin/utest/udev scsi_generic
> root      9700 99.9  0.0  1664  604 ?        R<   08:37  17:37 /sbin/utest/udev scsi_generic
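
A small sketch in C of why the alarm cannot stop such a process. This is
not the tdb loop and not a proposed fix; spin_until_alarm() and the flag
name are invented for illustration. When SIGALRM arrives during a pure
user-space loop, the handler runs and control returns to the loop. The
loop only ends if it checks the flag the handler set.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t alarm_fired = 0;

/* Records the timeout only; the interrupted code resumes afterwards. */
static void note_alarm(int signo)
{
    (void)signo;
    alarm_fired = 1;
}

/* Hypothetical helper: spin without any system call until the handler's
 * flag is seen. Returns 1 if the loop ended because of the alarm.
 * Remove the flag test from the loop condition and the handler would
 * still run, but the loop would never terminate. */
int spin_until_alarm(unsigned seconds)
{
    struct sigaction sa, old;
    volatile unsigned long spins = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = note_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old) < 0)
        return -1;

    alarm_fired = 0;
    alarm(seconds);
    while (!alarm_fired)
        spins++;                     /* pure CPU work, nothing for EINTR to hit */

    sigaction(SIGALRM, &old, NULL);
    return alarm_fired ? 1 : 0;
}
```

While the loop runs, the process would show up in ps in state R with high
CPU usage, much like pid 9700 above.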

Thanks,
Kay




