From mboxrd@z Thu Jan  1 00:00:00 1970
From: Kay Sievers
Date: Fri, 01 Oct 2004 07:36:33 +0000
Subject: Re: Hanging udev process on nfs-mounted /dev
Message-Id: <1096616193.4295.35.camel@localhost.localdomain>
List-Id:
References: <415980BF.1020401@bio.ifi.lmu.de>
In-Reply-To: <415980BF.1020401@bio.ifi.lmu.de>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: linux-hotplug@vger.kernel.org

On Fri, 2004-10-01 at 08:25 +0200, Frank Steiner wrote:
> Kay Sievers wrote
>
> > It would be nice to know if there is possibly one process spinning at
> > this time, which blocks all the other processes? Or if there is a "real"
> > deadlock, where all processes are blocking in the lock call.
>
> As far as I remember, when the udev process was running with 90% cpu time,
> it was the only udev process (pgrep udev).

This may be the process that blocks all the other ones. If you can find
one of these beasts, please attach gdb to the running process and see
whether we can find something in the backtrace.

Here is a sample from my "lock the whole file"-test application:

* [root@pim ~]# gdb -p 14727
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
Attaching to process 14727
...
Reading symbols from /home/kay/src/lock...(no debugging symbols found)...done.
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
0x0804839f in spin ()

* (gdb) bt
#0  0x0804839f in spin ()
#1  0x08048405 in main ()

* (gdb) q
The program is running.  Quit anyway (and detach it)?
(y or n) y
Detaching from program: /home/kay/src/lock, process 14727
[root@pim ~]#

> With debugging and logging enabled now (needed for your patch to compile),
> I get lots of messages from udev broadcasted to every shell, which is
> quite annoying for the users, because they get all their xterms filled:
>
> Oct  1 05:46:50 noether udevinfo[336]: rec_read bad magic 0xd9fee666 at offset=7912

Oops, that is from the tdb code and indicates a corrupt database, which
is likely the reason for all the bad behavior. You may try to
"rm /dev/.udev.tdb" and see whether these messages go away. The next
udev run will create a new one.

Does "udevinfo -d" (database dump) print anything?

The /dev is stored on nfs and not cleaned and recreated with udevstart
before mounting, right? So the database may have been corrupt for a
long time?

Best,
Kay