From mboxrd@z Thu Jan  1 00:00:00 1970
From: Frank Steiner
Date: Fri, 01 Oct 2004 07:55:57 +0000
Subject: Re: Hanging udev process on nfs-mounted /dev
Message-Id: <415D0D8D.3000306@bio.ifi.lmu.de>
List-Id:
References: <415980BF.1020401@bio.ifi.lmu.de>
In-Reply-To: <415980BF.1020401@bio.ifi.lmu.de>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: linux-hotplug@vger.kernel.org

Kay Sievers wrote

> This may be the process that blocks all the other ones. If you can find
> one of these beasts, please attach gdb to the running process and look
> if we find something in the backtrace. Here is a sample from my
> "lock the whole file"-test application:

Arrrgggh damn! I wish I had waited a bit longer before killing the process,
so that I could have read this mail first and tried gdb :-((
But since it was the professor's (my boss :-)) client, he wanted to have
it back working quickly.

I will try to get the lock again on another host by rebooting it over
and over again; maybe I can trigger the lock.

>>With debugging and logging enabled now (needed for your patch to compile),
>>I get lots of messages from udev broadcasted to every shell, which is
>>quite annoying for the users, because they get all their xterms filled:
>>
>>Oct 1 05:46:50 noether udevinfo[336]: rec_read bad magic 0xd9fee666 at offset=7912
>
>
> Oops, that is from the tdb-code and indicates a corrupt database, which
> is likely the reason for all the bad behavior. You may try to
> "rm /dev/.udev.tdb" and look if these messages are going away. The next
> udev run will create a new one.

Hmm, this sounds like the problem is NFS without locking. Maybe two processes
are indeed writing to the database concurrently and thus corrupting it. That
would also explain why I don't see any of these messages on the tmpfs hosts.

I wish there was a solution for nfsroot with nfs locking :-(

> Does "udevinfo -d" (database dump) print anything?

Not very much, just 4 entries:

noether /var/log# udevinfo -d
P: /block/fd0
N: fd0
T: b
M: 060660
S:
O: root
G: disk
F:
L: 0
U: 55

P: /block/loop4
N: loop4
T: b
M: 060660
S:
O: root
G: disk
F:
L: 0
U: 55

P: /block/ram0
N: ram0
T: b
M: 060660
S:
O: root
G: disk
F:
L: 0
U: 56

P: /class/scsi_generic/sg0
N: sg0
T: c
M: 020640
S: by-path/usb-storage-00000000710D:0:0:0-generic
O: root
G: disk
F: /etc/udev/rules.d/udev.rules
L: 6
U: 498

noether /var/log#
Message from syslogd@noether at Fri Oct 1 09:44:59 2004 ...
noether udevinfo[11629]: rec_read bad magic 0xd9fee666 at offset=63176

And that's it. On a hardware-identical host with /dev being a tmpfs, I have
about 180 entries!

>
> The /dev is stored on nfs and not cleaned and recreated with udevstart
> before mounting, right? So the database may be corrupt since a long
> time?

boot.udev is run on boot, so it recreates the database on every start.
It therefore looks like the database gets corrupted again on almost every boot.

Note that my scenario here is a bit of a mix, because I just started
using udev by backporting the hotplug stuff from SuSE 9.1 to SuSE 9.0.
But SuSE 9.1 still uses a static /dev, with udev only for certain
things like hotplugging of e.g. usb devices or pktsetup etc.
Since most of the SuSE boot scripts are not prepared for working
with an empty /dev, many devices are missing if I run udevstart on an
empty /dev, e.g. things like /dev/stdin.
Because I didn't want to hack every SuSE script, I kept their static
devices (from a devs.rpm), but boot.udev is still running.

I can try to reproduce it on another host to get the gdb stuff, but I
feel pretty sure now that it is a problem with the "nolock" mount option
for the NFS-based /dev...

cu,
Frank

--
Dipl.-Inform. Frank Steiner   Web:  http://www.bio.ifi.lmu.de/~steiner/
Lehrstuhl f. Bioinformatik    Mail: http://www.bio.ifi.lmu.de/~steiner/m/
LMU, Amalienstr.
17 Phone: +49 89 2180-4049
80333 Muenchen, Germany       Fax:   +49 89 2180-99-4049
* You can only understand recursion once you have understood recursion. *
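
For reference, the failure mode suspected in this message (two processes writing the same database file at once) is what an exclusive whole-file lock is meant to prevent. Below is a minimal sketch in Python of a writer that holds such a lock for the duration of its update. The function name `append_record_locked` and the record format are hypothetical and are not part of udev or tdb; the sketch only shows the general fcntl-style locking pattern. It also assumes the filesystem actually honors the lock across all writers, which is exactly what is in doubt for a /dev mounted over NFS with "nolock".

```python
import fcntl
import os


def append_record_locked(path: str, record: bytes) -> int:
    """Append one record while holding an exclusive whole-file lock.

    Hypothetical helper for illustration only. Returns the file offset
    at which the record was written.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # With the default len=0 and start=0 the lock covers the whole file.
        # LOCK_EX blocks until no other process holds a conflicting lock.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        try:
            # Seek to the end only after the lock is held, so that two
            # writers cannot both pick the same offset.
            offset = os.lseek(fd, 0, os.SEEK_END)
            os.write(fd, record)
            os.fsync(fd)
            return offset
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

If the lock is not enforced between the writers, two of them can compute the same end-of-file offset and overwrite each other's records, which would be consistent with the "bad magic" errors above. This remains a guess until the gdb backtrace from a hanging process is available.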