public inbox for linux-kernel@vger.kernel.org
* Monitoring filesystems / blockdevice for errors
@ 2000-12-17 14:34 Lars Marowsky-Bree
  2000-12-17 18:23 ` Mark Hahn
  0 siblings, 1 reply; 5+ messages in thread
From: Lars Marowsky-Bree @ 2000-12-17 14:34 UTC
  To: linux-fsdevel; +Cc: linux-kernel

Good morning,

currently, there is no way for an external application to monitor whether a
filesystem or underlying block device has hit an error condition - internal
inconsistency, read or write error, whatever.

Short of parsing syslog messages, which isn't particularly great.

Such a mechanism is necessary for server monitoring in general.

I don't have a real idea how this could be added, short of adding a field to
/proc/partitions (error count) or something similar.
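
A minimal sketch of how a monitor might read such a field, assuming a
hypothetical fifth "errors" column appended to /proc/partitions. No such
column exists; the layout, the sample text, and the function names below
are assumptions made only for illustration.

```python
# Hypothetical: /proc/partitions has no error-count column. The fifth
# field in this sample is an assumed extension, not a real kernel format.
SAMPLE = """\
major minor  #blocks  name  errors

   8     0   17921835 sda   0
   8     1    1048576 sda1  2
"""


def parse_error_counts(text):
    """Return {device_name: error_count} from the assumed five-column layout."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and blank lines; data rows start with a numeric major.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        counts[fields[3]] = int(fields[4])
    return counts


def new_errors(previous, current):
    """Return {device_name: increase} for devices whose count grew between polls."""
    return {
        name: count - previous.get(name, 0)
        for name, count in current.items()
        if count > previous.get(name, 0)
    }
```

A monitor would call parse_error_counts() on each poll and hand the
previous and current results to new_errors() to see what changed.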

Comments?

Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>

-- 
Perfection is our goal, excellence will be tolerated. -- J. Yahl

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

* Re: Monitoring filesystems / blockdevice for errors
  2000-12-17 14:34 Monitoring filesystems / blockdevice for errors Lars Marowsky-Bree
@ 2000-12-17 18:23 ` Mark Hahn
  2000-12-17 18:43   ` Lars Marowsky-Bree
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Hahn @ 2000-12-17 18:23 UTC
  To: Lars Marowsky-Bree; +Cc: linux-fsdevel, linux-kernel

> currently, there is no way for an external application to monitor whether a
> filesystem or underlying block device has hit an error condition - internal
> inconsistency, read or write error, whatever.
> 
> Short of parsing syslog messages, which isn't particularly great.

what's wrong with it?  reinventing /proc/kmsg and klogd would be tre gross.

> I don't have a real idea how this could be added, short of adding a field to
> /proc/partitions (error count) or something similar.

for reporting errors, that might be OK, but it's not a particularly nice
_notification_ mechanism...

regards, mark hahn.


* Re: Monitoring filesystems / blockdevice for errors
  2000-12-17 18:23 ` Mark Hahn
@ 2000-12-17 18:43   ` Lars Marowsky-Bree
  2000-12-18  5:28     ` Peter Samuelson
  0 siblings, 1 reply; 5+ messages in thread
From: Lars Marowsky-Bree @ 2000-12-17 18:43 UTC
  To: Mark Hahn; +Cc: linux-fsdevel, linux-kernel

On 2000-12-17T13:23:52,
   Mark Hahn <hahn@coffee.psychology.mcmaster.ca> said:

> > Short of parsing syslog messages, which isn't particularly great.
> what's wrong with it?

Because it means having to know about all potential messages the filesystems
might dump out.
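
A minimal sketch of that brittleness, assuming a monitor that matches log
lines against a hand-maintained pattern table. The message formats and
names below are made up for illustration and are not real kernel output.

```python
import re

# Made-up message formats for illustration only. Each real filesystem and
# driver words its errors differently, so this table is never complete.
KNOWN_PATTERNS = [
    ("fs_inconsistency", re.compile(r"^examplefs error \(device (?P<dev>\S+)\):")),
    ("io_error", re.compile(r"^(?P<dev>\S+): I/O error\b")),
]


def classify(line):
    """Return (kind, device) for a recognised line, or None if no pattern fits."""
    for kind, pattern in KNOWN_PATTERNS:
        match = pattern.search(line)
        if match:
            return kind, match.group("dev")
    return None
```

Any message whose wording is not in the table falls through to None, so
an error from a filesystem the monitor has never heard of goes unnoticed.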

> reinventing /proc/kmsg and klogd would be tre gross.

Well, only one process can read kmsg and get notified about new messages at
any time, so that makes the monitoring depend on klogd/syslogd working, which
given a write error by syslog might not be the case...

> > I don't have a real idea how this could be added, short of adding a field to
> > /proc/partitions (error count) or something similar.
> for reporting errors, that might be OK, but it's not a particularly nice
> _notification_ mechanism...

Well, yes.

Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>

-- 
Perfection is our goal, excellence will be tolerated. -- J. Yahl


* Re: Monitoring filesystems / blockdevice for errors
  2000-12-17 18:43   ` Lars Marowsky-Bree
@ 2000-12-18  5:28     ` Peter Samuelson
  2000-12-18  8:46       ` Jan-Benedict Glaw
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Samuelson @ 2000-12-18  5:28 UTC
  To: Lars Marowsky-Bree; +Cc: Mark Hahn, linux-fsdevel, linux-kernel


  [Mark Hahn]
> > reinventing /proc/kmsg and klogd would be tre gross.

[Lars Marowsky-Bree]
> Well, only one process can read kmsg and get notified about new
> messages at any time, so that makes the monitoring depend on
> klogd/syslogd working, which given a write error by syslog might not
> be the case...

So rewrite klogd to do something much simpler for serious errors (yes
they will be tagged as such) before trying to pass them on to syslogd.
Or does it already do this?  It's a userspace problem.
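
A minimal sketch of the filtering step, assuming the reader receives
lines in the "<N>text" form that /proc/kmsg delivers, where N is the
printk level (0 is most severe). The function names, the threshold
choice, and the sample messages in the usage are assumptions; what the
"much simpler" action should be is left open here.

```python
import re

LEVEL_PREFIX = re.compile(r"^<(\d)>(.*)$")
KERN_ERR = 3  # levels 0..3 are EMERG, ALERT, CRIT, ERR


def split_level(line):
    """Return (level, text); a line without a level prefix gets level None."""
    match = LEVEL_PREFIX.match(line)
    if match is None:
        return None, line
    return int(match.group(1)), match.group(2)


def serious(lines, threshold=KERN_ERR):
    """Yield the text of messages whose level is at least as severe as threshold."""
    for line in lines:
        level, text = split_level(line.rstrip("\n"))
        if level is not None and level <= threshold:
            yield text
```

For example, feeding it ["<6>eth0 up", "<2>disk gone"] yields only
"disk gone", which could be handled on a simple path before anything is
handed to syslogd.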

Peter

* Re: Monitoring filesystems / blockdevice for errors
  2000-12-18  5:28     ` Peter Samuelson
@ 2000-12-18  8:46       ` Jan-Benedict Glaw
  0 siblings, 0 replies; 5+ messages in thread
From: Jan-Benedict Glaw @ 2000-12-18  8:46 UTC
  To: linux-fsdevel, linux-kernel

On Sun, Dec 17, 2000 at 11:28:46PM -0600, Peter Samuelson wrote:
>   [Mark Hahn]
> > > reinventing /proc/kmsg and klogd would be tre gross.
> 
> [Lars Marowsky-Bree]
> > Well, only one process can read kmsg and get notified about new
> > messages at any time, so that makes the monitoring depend on
> > klogd/syslogd working, which given a write error by syslog might not
> > be the case...
> 
> So rewrite klogd to do something much simpler for serious errors (yes
> they will be tagged as such) before trying to pass them on to syslogd.
> Or does it already do this?  It's a userspace problem.

Hmmm... Even if LMB and I are often of quite different opinions, I
think only modifying klogd is not enough. LMB stated that a userspace
tool would need to know every error message that could possibly be
generated. Cleaning up all messages would be the first
step to prepare for failure reports to userspace. I.e., what errors do
we have?

	- Sense errors (recoverable)            \
	-    "    "    (unrecoverable)           > for all kinds of devices
	- Complete device failure (HDD is gone) /
	- Data failure (wrong ext2 bitmaps) for all FS
	- RAM's ECC/parity errors
	- possibly some more;)

Cleaning up all error messages (maybe using exactly two lines: one for kind
of failure, one for device/RAM/fs specific messages) could help a lot
and doesn't hurt badly (code doesn't get really slower as these paths
are more-or-less never taken; but there is a little bit more bloat...).

With such an infrastructure, klogd could pass those lines to an external
helper (and additionally to syslog).
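
A minimal sketch of such a dispatcher, assuming a two-line record where
the first line names the kind of failure and the second carries the
device/RAM/fs detail. The "ERRKIND: " and "ERRINFO: " tags, the function
name, and the callback signatures are hypothetical; no such format exists
in the kernel.

```python
KIND_TAG = "ERRKIND: "  # hypothetical tag for the kind-of-failure line
INFO_TAG = "ERRINFO: "  # hypothetical tag for the device/RAM/fs detail line


def dispatch(lines, helper, forward):
    """Call helper(kind, detail) for each KIND line directly followed by an
    INFO line, and pass every line on to forward() as well."""
    pending_kind = None
    for line in lines:
        if line.startswith(KIND_TAG):
            # Remember the failure kind until its detail line arrives.
            pending_kind = line[len(KIND_TAG):]
        elif line.startswith(INFO_TAG) and pending_kind is not None:
            helper(pending_kind, line[len(INFO_TAG):])
            pending_kind = None
        else:
            # Anything else breaks the pair, including a stray INFO line.
            pending_kind = None
        # The line still goes to the normal log path (syslog in the text above).
        forward(line)
```

The helper only ever sees the two fixed fields, so it does not need to
know the wording of individual driver or filesystem messages.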

MfG, JBG

-- 
Fehler eingestehen, Größe zeigen: Nehmt die Rechtschreibreform zurück!!!
/* Jan-Benedict Glaw <jbglaw@lug-owl.de> -- +49-177-5601720 */
keyID=0x8399E1BB fingerprint=250D 3BCF 7127 0D8C A444 A961 1DBD 5E75 8399 E1BB
     "insmod vi.o and there we go..." (Alexander Viro on linux-kernel)

