From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kay Sievers
Date: Tue, 03 Oct 2006 21:52:23 +0000
Subject: Re: Hardware error reporting [was Re: PCI Error reporting]
Message-Id: <1159912343.3427.43.camel@localhost>
List-Id:
References: <20061003152636.GA4381@austin.ibm.com>
In-Reply-To: <20061003152636.GA4381@austin.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
To: linux-hotplug@vger.kernel.org

On Tue, 2006-10-03 at 11:26 -0500, Linas Vepstas wrote:
> On Tue, Oct 03, 2006 at 05:57:20PM +0200, Kay Sievers wrote:
> >
> > Error classification/reporting is a completely missing piece in Linux.
> > Today there is no sane example of error reporting in the Linux kernel.
> > Printk and friends are totally useless for anything else than the geek
> > in front of the computer. Until the kernel gets a sane error
> > classification/reporting infrastructure, it's impossible to solve such a
> > problem.
> >
> > And just in case: using the driver-core event-infrastructure (udev) is
> > the totally wrong approach to relay kernel errors to userspace.
>
> So what's the right approach?
>
> Historically, I notice there was an attempt called "evlog"
> (http://evlog.sourceforge.net/) which bombed out; the latest
> patches were to 2.6.4 from 2005.

It would need several pieces:

A transport that can safely relay binary data like hardware data,
firmware dumps, sense codes... It would need to be reliable from
early boot on, without overwriting its own data the way the kernel log
buffer does. Maybe debugfs/relayfs can be used here.

Some sort of event channel, to wake up userspace when something has
happened. The driver core uevents are usually too heavy for such a
thing. We could make them fit the need, but that would mean changing
the netlink interface and udev, because the current interface, which
udev uses, can't do that.

We would need to define the properties the "error event" should carry.
Usually the DEVPATH of the device, but not all errors come from
something which is registered with the driver core. In most cases the
DEVPATH should still be a good way to associate the event with a
well-known device in userspace.

The event must also carry some sort of classification that goes beyond
the personal taste of the author of the driver. It must be something
generic that can be interpreted by software and not only by human
beings. It must be well defined for every class of device or subsystem.

The problem is that such an infrastructure needs a lot of work on the
communication side, to collect the requirements and bring subsystem
maintainers together. In that area, Linux is pretty hard to handle, and
whoever is willing to do this will need a lot of patience to convince
people that this is needed and more useful than the current horrible
printk solution. This may be the hardest part of that job.

Thanks,
Kay
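
For illustration only, here is a minimal userspace sketch of what an
"error event" with the properties listed above might look like: an
optional DEVPATH, a generic machine-readable classification, and an
opaque binary payload. Nothing here is an existing kernel or udev
interface. The record layout (NUL-separated KEY=value pairs, an empty
field, then raw payload bytes), the key names, and the severity values
are all assumptions made up for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Severity(Enum):
    """Hypothetical generic severity levels (not an existing kernel enum)."""
    CORRECTED = "corrected"
    RECOVERABLE = "recoverable"
    FATAL = "fatal"


@dataclass(frozen=True)
class ErrorEvent:
    subsystem: str          # e.g. "pci" or "scsi"
    error_class: str        # generic, software-interpretable classification
    severity: Severity
    devpath: Optional[str]  # None if the source is not in the driver core
    payload: bytes          # opaque binary data (firmware dump, sense codes)


def parse_error_event(record: bytes) -> ErrorEvent:
    """Parse one record in the assumed layout:

    KEY=value NUL KEY=value ... NUL NUL <binary payload>
    """
    # The first double NUL ends the text header; everything after it is payload.
    header, sep, payload = record.partition(b"\0\0")
    if not sep:
        raise ValueError("record has no header terminator")

    fields = {}
    for item in header.split(b"\0"):
        key, eq, value = item.partition(b"=")
        if not eq:
            raise ValueError(f"malformed header field: {item!r}")
        fields[key.decode("ascii")] = value.decode("utf-8")

    try:
        return ErrorEvent(
            subsystem=fields["SUBSYSTEM"],
            error_class=fields["ERROR_CLASS"],
            severity=Severity(fields["SEVERITY"]),
            devpath=fields.get("DEVPATH"),  # optional by design
            payload=payload,
        )
    except KeyError as missing:
        raise ValueError(f"missing required field: {missing}") from None


def make_record(fields: dict, payload: bytes) -> bytes:
    """Build a record in the same assumed layout (useful for experiments)."""
    header = b"\0".join(f"{k}={v}".encode("utf-8") for k, v in fields.items())
    return header + b"\0\0" + payload
```

DEVPATH is optional in this sketch because not every error comes from
something registered with the driver core, while the classification
fields are required so that software can act on the event without
parsing free-form text.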