From: Stan Hoeppner <stan@hardwarefreak.com>
To: Rotem Ben Arye <rotem.benarye@gmail.com>,
"xfs@oss.sgi.com" <xfs@oss.sgi.com>
Subject: Re: XFS File System Monitor
Date: Sat, 04 Jan 2014 09:45:06 -0600
Message-ID: <52C82C82.8020801@hardwarefreak.com>
In-Reply-To: <CA+apj_iOy2dqyPGunKe91WLCqy71uE1uq2HZQ_v=+QHewCymeA@mail.gmail.com>
On 1/4/2014 2:21 AM, Rotem Ben Arye wrote:
> Hi Stan,
> Thank you for the focused answer. Just to be clear, I'm aware that SNMP
> monitoring checks on /var/log/messages of a production server during a
> power failure, which would only tell us afterwards that "a power outage
> caused file system corruption", are useless.
>
> But for all the other cases you mention, "bugs in the XFS code or
> elsewhere in the Linux kernel, transient or permanent hardware failures",
> is there no suitable log that we can track to get an indication of
> the kind of event you specified?
Errors due to such problems are logged in dmesg. Hardware problems will
usually show up as IO errors generated from the device on which XFS
resides. When these occur XFS will typically initiate automatic
shutdown of the filesystem to prevent (further) corruption. In this
case the log entry occurs simultaneously with the shutdown, so
monitoring logs won't notify you in advance of this problem. Monitoring
your hardware may.
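
As a rough illustration of after-the-fact log checking, here is a minimal
Python sketch that filters kernel log lines for XFS-related messages and IO
errors. The two patterns and the helper names (suspect_lines, scan_dmesg) are
assumptions made for illustration, not an official list of XFS or kernel
messages, and as noted above a match means the problem has already happened.

```python
import re
import subprocess
from typing import Iterable, List

# Assumed patterns for illustration only; not an official or complete
# list of XFS or kernel error messages.
SUSPECT_PATTERNS = [
    re.compile(r"\bxfs\b", re.IGNORECASE),
    re.compile(r"i/o error", re.IGNORECASE),
]


def suspect_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that match any of the assumed patterns."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(p.search(line) for p in SUSPECT_PATTERNS)
    ]


def scan_dmesg() -> List[str]:
    """Run dmesg and filter its output (hypothetical helper).

    Reading the kernel log may require root privileges on some systems.
    """
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=False
    )
    return suspect_lines(result.stdout.splitlines())
```

Calling scan_dmesg() from a cron job or a monitoring agent and alerting on a
non-empty result would give notification shortly after an event, not before.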
If you get corruption due to a software bug, you may not see an error in
the log until after the filesystem suffers the corruption event.
Usually when you see errors of this nature in the log it is because
corruption has already occurred, possibly long ago, but is just now
being detected by code specifically added to XFS to detect such things.
For example, say your filesystem is 3 years old, corruption occurred in
year one, and an update to XFS 2 years later looks for such corruption
whereas before it did not. Depending on the severity of the corruption,
xfs_repair may be able to fix it, or it may not. If not, you can ask
for help here.
So again, I'm not aware of any proactive monitoring that would help in
these situations. Of course it would be nice to know if something is
going to fail beforehand, but this isn't always possible, unfortunately.
--
Stan
> Thank you.
>
>
>
> On Thu, Jan 2, 2014 at 5:07 PM, Stan Hoeppner <stan@hardwarefreak.com> wrote:
>
>> On 1/2/2014 6:16 AM, Rotem Ben Arye wrote:
>>> Hi, SGI Support Team.
>>> My name is Rotem. I am a Linux/Unix System Administrator at a web
>>> company in Israel.
>>> I have a question on which I would like your advice.
>>>
>>> Last weekend we had a crisis on one of the production servers in
>>> the company. The problem was defined by the integrators as "xfs file
>>> system corrupted".
>>> My question is: what open source tools can we use at runtime in a
>>> production environment to monitor and sample a mounted XFS file system,
>>> to get an indication that something is not well and may lead to a
>>> problem?
>>>
>>> We are working in a Linux environment on CentOS distributions server.
>>
>> So in a nutshell you're looking for a monitor application that will in
>> essence give you a green, yellow, or red light informing you of the
>> filesystem's health. Or some kind of SNMP logging that suggests a
>> corruption is imminent.
>>
>> There is no such tool, and never will be. Nearly all XFS corruption
>> events are caused by either software bugs in the XFS code or elsewhere
>> in the Linux kernel, transient or permanent hardware failures, or power
>> failures, at some layer in the storage stack. It is not feasible to
>> predict such events.
>>
>> When an XFS corruption occurs, one should report all related log
>> information and errors to this list so that the problem may be analyzed
>> and the root cause identified. Then the proper corrective action can be
>> identified and implemented to fix the problem and hopefully prevent it
>> from reoccurring.
>>
>> --
>> Stan
>>
>