From: "lxh" <lxhzju@163.com>
To: xfs <xfs@oss.sgi.com>
Subject: Re: Re: kernel oops on debian, 2.6.18-5, large xfs volume
Date: Fri, 25 Jan 2008 17:41:05 +0800
Message-ID: <200801251741035934497@163.com>
Hi,
======= 2008-01-25 16:01:54 =======
>On Fri, Jan 25, 2008 at 03:16:36PM +0800, lxh wrote:
>> Hello,
>> we have dozens of file servers with a 1.5TB/2.5 TB large xfs file system
>> volume running on a RAID6 SATA array. Each volume contains about
>> 10,000,000 files. The Operating system is debian GNU/Linux 2.6.18-5-amd64
>> #1 SMP. we got a kernel oops frequently last year.
>>
>> here is the oops :
>> Filesystem "cciss/c0d1": XFS internal error xfs_trans_cancel at line 1138
>> of file fs/xfs/xfs_trans.c. Caller 0xffffffff881df006
>> Call Trace:
>> [<ffffffff881fed18>] :xfs:xfs_trans_cancel+0x5b/0xfe
>> [<ffffffff88207006>] :xfs:xfs_create+0x58b/0x5dd
>> [<ffffffff8820f496>] :xfs:xfs_vn_mknod+0x1bd/0x3c8
>
>Are you running out of space in the filesystem?
We did not run out of space; there is enough free space for writing.
>
>The only vectors I've seen that can cause this are I/O errors
>or ENOSPC during file create after we've already checked that
>this cannot happen. Are there any I/O errors in the log?
>
When we ran xfs_repair, it output nothing special.
I guess this problem is related to the large volumes holding a huge number of small files. Some of our servers have the same hardware and software, but they are configured with 1TB volumes and store large files. The problem has never happened on them.
>This commit:
>
>http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=45c34141126a89da07197d5b89c04c6847f1171a
>
>which is in 2.6.23 fixed the last known cause of the ENOSPC
>issue, so upgrading the kernel or patching this fix back
>to the 2.6.18 kernel may fix the problem if it is related to
>ENOSPC.
Thank you very much for your help! I will try this patch on some machines.
>
>> Every time the error occurs, the volume can not be accessed. So we have to
>> umount this volume, run xfs_repair, and then remount it. This problem
>> causes seriously impact of our service.
>
>Anyway, next time it happens, can you please run xfs_check on the
>filesystem first and post the output? If there is no output, then
>the filesystem is fine and you don't need to run repair.
The volume is unusable when it happens, so we run xfs_repair. xfs_repair outputs nothing special, but afterwards we can access the volume again. I don't know why.
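To follow the order suggested above (check first, repair only if the check prints something), a rough sketch of what we could run next time is below. It is only an outline under assumptions: the volume must already be unmounted, the function name, the log path and the XFS_CHECK / XFS_REPAIR override variables are made up for illustration, and xfs_check may be missing on newer xfsprogs releases.

```shell
#!/bin/sh
# Sketch: run the checker first and repair only when it reports something.
# The filesystem must be unmounted before calling this.
# XFS_CHECK and XFS_REPAIR are hypothetical override variables so the
# function can be tried with stub commands; they default to the real tools.
check_then_repair() {
    dev=$1
    log=${2:-/tmp/xfs_check.out}    # illustrative default log location

    # Keep everything the checker prints so it can be posted to the list.
    ${XFS_CHECK:-xfs_check} "$dev" > "$log" 2>&1

    if [ -s "$log" ]; then
        # Non-empty output: something was reported, so repair and keep that log too.
        echo "xfs_check reported problems, saved in $log; running repair"
        ${XFS_REPAIR:-xfs_repair} "$dev" 2>&1 | tee "$log.repair"
    else
        # No output at all is treated as a clean filesystem.
        echo "xfs_check printed nothing; filesystem looks fine, skipping repair"
    fi
}
```

Called as check_then_repair with the device path as the first argument, this leaves the check output and, if a repair was needed, the repair output in files that can be attached to a reply.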
>
>If it is not fine, can also post the output of xfs_repair?
>
>Once the filesystem has been fixed up, can you then post the
>output of this command to tell us the space usage in the filesystems?
>
># xfs_db -r -c 'sb 0' -c p <dev>
I will follow your suggestions when it happens again, and then contact you.
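To make the superblock dump easier to read at a glance, something like the sketch below could summarise the space usage from it. This is an assumption-laden outline: it expects the xfs_db output on stdin with one "name = value" field per line, it only looks at the blocksize, dblocks, fdblocks, icount and ifree fields, and the function name sb_usage is made up.

```shell
#!/bin/sh
# Sketch: summarise space and inode usage from a superblock dump.
# Input on stdin: the output of the xfs_db "sb 0" / "p" commands quoted above,
# assumed to contain lines of the form "name = value".
sb_usage() {
    awk -F' = ' '
        $1 == "blocksize" { bs = $2 }       # filesystem block size in bytes
        $1 == "dblocks"   { total = $2 }    # total data blocks
        $1 == "fdblocks"  { free = $2 }     # free data blocks
        $1 == "icount"    { icount = $2 }   # allocated inodes
        $1 == "ifree"     { ifree = $2 }    # allocated but unused inodes
        END {
            if (total == 0) { print "no superblock fields found"; exit 1 }
            # %.0f is used instead of %d so large counts print correctly
            # on awk implementations with 32-bit integer formatting.
            printf "free blocks: %.0f of %.0f (%.1f%%)\n", free, total, 100 * free / total
            printf "free bytes: %.0f\n", free * bs
            printf "inodes allocated: %.0f, free: %.0f\n", icount, ifree
        }'
}
```

Piping the xfs_db output into sb_usage prints three summary lines. The raw dump should still be posted as requested, since the summary drops every other field.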
>
>Cheers,
>
>Dave.
>--
>Dave Chinner
>Principal Engineer
>SGI Australian Software Group
= = = = = = = = = = = = = = = = = = = =
Cheers,
Luoxiaohua
NetEase.com Inc
lxhzju@163.com
2008-01-25