From: Hong Tran Duc <hongtd2k@gmail.com>
To: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-ide@vger.kernel.org
Subject: Oops when read/write or mount/unmount continuously ~ 600.000 times
Date: Sun, 03 Aug 2008 19:49:50 +0700
Message-ID: <4895A96E.2040303@gmail.com>
Hi all,
I’m using kernel 2.4.20 with full preemption enabled (patch applied and
the CONFIG option set). My CPU is a PowerPC 750FX, with an 80 GB HDD and
512 MB of RAM.
I get many Oopses when I mount/unmount or read/write on an ATA HDD
continuously, about 600,000 times (over several hours). The Oops usually
occurs when the CPU traps with SIGSEGV or SIGILL, sometimes in the page
management code, sometimes in the scheduler, block I/O, or filesystem
code.
It happens most frequently in:
Block I/O: make_request, generic_make_request, submit_bh, bdfind, bmap,
__wait_on_buffer, ...
Filesystem: journal_commit_transaction, kill_super, invalidate_inode,
invalidate_list, ...
In almost every case the cause is a broken linked list in those
functions, e.g. linkedlist->next or linkedlist->prev is NULL or points
to an invalid address.
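A minimal userspace sketch of the kind of consistency check that could flag such a broken list early. The struct only mimics the kernel's struct list_head, and list_entry_is_sane() is a hypothetical helper, not an existing kernel function. It only catches NULL or mismatched neighbour pointers; a non-NULL garbage address would still fault when dereferenced.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's circular doubly linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

/* Initialise an empty circular list: the head points to itself. */
static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert 'entry' directly after 'head'. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/*
 * Hypothetical helper: true if neither neighbour pointer is NULL and
 * both neighbours point back at 'entry'.
 */
static bool list_entry_is_sane(const struct list_head *entry)
{
    if (entry == NULL || entry->next == NULL || entry->prev == NULL)
        return false;
    return entry->next->prev == entry && entry->prev->next == entry;
}
```

Calling such a check just before each list operation in the affected paths might narrow down when the corruption happens, rather than only where it is later detected.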
In the SIGILL case, the instruction pointer (NIP) is the same as the
return address register (LR).
The most recent Oops was in __wait_on_buffer(). The main steps of
__wait_on_buffer() are:
1. get_bh -> atomic_inc(&bh->b_count);
2. add to the wait queue
3. loop: do some task-state manipulation, call *schedule()*
4. remove from the wait queue
5. put_bh -> atomic_dec(&bh->b_count); *<- Oops here, SEGV because
&bh->b_count = r25 = 0x02*
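The sequence above can be modelled in userspace roughly as follows. This is only a sketch: the struct is a heavily simplified stand-in for struct buffer_head, the wait queue is reduced to a counter, and fake_schedule() and model_wait_on_buffer() are hypothetical names. In the 2.4 sources get_bh() increments b_count and put_bh() decrements it.

```c
#include <stdbool.h>

/* Simplified stand-in; the real struct buffer_head has many more fields. */
struct buffer_head {
    int  b_count;   /* reference count (atomic_t in the kernel) */
    bool locked;    /* models the buffer-locked state bit */
    int  waiters;   /* models entries on the b_wait wait queue */
};

static void get_bh(struct buffer_head *bh) { bh->b_count++; }
static void put_bh(struct buffer_head *bh) { bh->b_count--; }

/* Stand-in for schedule(): "another task" completes the I/O and unlocks. */
static void fake_schedule(struct buffer_head *bh)
{
    bh->locked = false;
}

static void model_wait_on_buffer(struct buffer_head *bh)
{
    get_bh(bh);              /* 1. take a reference */
    bh->waiters++;           /* 2. add to the wait queue */
    while (bh->locked)       /* 3. sleep until the buffer is unlocked */
        fake_schedule(bh);
    bh->waiters--;           /* 4. remove from the wait queue */
    put_bh(bh);              /* 5. drop the reference; 'bh' must still be
                              *    the same valid pointer as in step 1 */
}
```

The point of the model is that step 5 relies on the pointer taken in step 1 surviving every call in between, which is exactly what fails in the Oops.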
After analysing the assembly code (uploaded to pastebin, links below) at
this point, I found that:
* At step (1), the address of bh->b_count is stored in register r25.
* Between steps (2) and (4), every function that uses r25 restores it
from the stack before returning (r25 is a non-volatile, callee-saved
register in the ABI used by gcc).
* At step (5), *r25 = 0x02. I don’t know why r25 changed. Maybe the
stack was corrupted somewhere?*
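As a debugging aid, a plausibility check on the saved pointer could be placed before step (5), so the corruption is reported instead of dereferenced. The sketch below is an assumption-laden illustration: plausible_kernel_pointer() is a hypothetical helper, and 0xC0000000 is assumed to be the kernel base address (the usual default on 32-bit PowerPC, but configurable).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: default 32-bit PowerPC kernel base; adjust to the real config. */
#define ASSUMED_PAGE_OFFSET 0xC0000000u

/*
 * Hypothetical helper: true if 'addr' could be a word-aligned address
 * in the kernel's linear mapping. A value such as 0x02 fails both tests.
 */
static bool plausible_kernel_pointer(uint32_t addr)
{
    if (addr < ASSUMED_PAGE_OFFSET)
        return false;            /* below the kernel base: user or junk */
    if (addr & 0x3u)
        return false;            /* not 4-byte aligned */
    return true;
}
```

Such a check cannot say who overwrote the saved register, but logging the first implausible value and the current task may help correlate the corruption with a specific code path.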
This Oops is very difficult to reproduce (it takes about 2 hours of
running the stress-test program). I tried increasing and reducing HZ by
a factor of 10, but the frequency of the bug did not change. I also
tried both ext2 and ext3, with the same result.
I’m really confused now. I don’t know where the real problem is, what
affects the frequency of the Oops, or how to debug and track down this
bug.
I am posting this to the list in the hope of getting some advice.
I uploaded some of the Oopses to pastebin:
http://vnoss.net/p/5783
http://vnoss.net/p/5785
Source and assembly of __wait_on_buffer():
http://vnoss.net/p/5784
Thanks for your help,
--
nm.
GPG Key ID: 0xDD253B25
Fingerprint: 2B17 D64A 9561 A443 2ABC 1302 4641 D0B7 DD25 3B25
Thread overview: 3+ messages
2008-08-03 12:49 Hong Tran Duc [this message]
2008-08-03 13:38 ` Oops when read/write or mount/unmount continuously ~ 600.000 times Matthew Wilcox
2008-08-03 15:18 ` Hong Tran Duc