From: Hong Tran Duc <hongtd2k@gmail.com>
To: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-ide@vger.kernel.org
Subject: Oops when read/write or mount/unmount continuously ~ 600.000 times
Date: Sun, 03 Aug 2008 19:49:50 +0700
Message-ID: <4895A96E.2040303@gmail.com>
Hi all,
I’m using kernel 2.4.20 with full preemption enabled (the preemption
patch is applied and the CONFIG option is set). My CPU is a PowerPC
750FX, with an 80 GB HDD and 512 MB of RAM.
I get many Oopses when I mount/unmount or read/write on an ATA HDD
continuously, about 600,000 times (over several hours). The Oops usually
occurs when the CPU traps with SIGSEGV or SIGILL, sometimes in the page
management code, sometimes in the scheduler, block I/O, or the
filesystem.
The Oops happens most frequently in:
Block I/O : make_request, generic_make_request, submit_bh, bdfind, bmap,
__wait_on_buffer ..
Filesystem: journal_commit_transaction, kill_super, invalidate_inode,
invalidate_list ..
The cause is almost always a broken linked list in those functions,
e.g. linkedlist->next or linkedlist->prev is NULL or set to an invalid
address.
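To make the kind of breakage concrete, here is a minimal user-space sketch of a consistency check for a circular doubly linked list. It is not kernel code: `struct list_node`, `list_node_is_consistent()` and `list_is_consistent()` are hypothetical names made up for illustration, and the check only catches NULL links and neighbours that do not point back. A wild non-NULL pointer would still fault when dereferenced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a kernel-style list node. */
struct list_node {
    struct list_node *next;
    struct list_node *prev;
};

/* True when both neighbours exist and each points back at this node. */
bool list_node_is_consistent(const struct list_node *node)
{
    if (node == NULL || node->next == NULL || node->prev == NULL)
        return false;
    return node->next->prev == node && node->prev->next == node;
}

/*
 * Walk a circular list starting at head, visiting at most max_nodes
 * entries. Returns true only if every visited node is consistent and
 * the walk comes back to head within the limit.
 */
bool list_is_consistent(const struct list_node *head, size_t max_nodes)
{
    const struct list_node *pos = head;

    assert(max_nodes > 0);
    for (size_t i = 0; i < max_nodes; i++) {
        if (!list_node_is_consistent(pos))
            return false;
        pos = pos->next;
        if (pos == head)
            return true;
    }
    return false; /* never returned to head: treat as broken */
}
```

A check of this shape, called just before a list is walked or modified, would report the corruption closer to the moment it happens than the later Oops does.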
In the SIGILL case, the instruction pointer (NIP) is the same as the
link register (LR), which holds the return address.
The most recent Oops was in __wait_on_buffer(). The main steps of
__wait_on_buffer() are:
1. get_bh -> atomic_inc(&bh->b_count);
2. add to the wait queue
3. loop: do some task manipulation, call *schedule()*
4. remove from the wait queue
5. put_bh -> atomic_dec(&bh->b_count); *<- Got the Oops here, SEGV
because r25, which should hold the address of bh->b_count, is 0x02*
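The five steps can be modelled in user space as below. This is only a sketch of the sequence, not the kernel source (the real source is in the pastebin link further down). `struct model_bh`, `model_ref_inc()`, `model_ref_dec()`, `model_schedule()` and `model_wait_on_buffer()` are hypothetical stand-ins, and the "schedule" step simply unlocks the buffer so the loop terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct buffer_head. */
struct model_bh {
    int  b_count;  /* reference count (an atomic_t in the kernel) */
    bool locked;   /* stands in for the buffer lock bit */
    int  waiters;  /* stands in for the length of the wait queue */
};

void model_ref_inc(struct model_bh *bh) { bh->b_count++; }
void model_ref_dec(struct model_bh *bh) { bh->b_count--; }

/* Stand-in for schedule(): pretend the I/O completed while we slept. */
void model_schedule(struct model_bh *bh) { bh->locked = false; }

/* Returns how many times the schedule stand-in was called. */
int model_wait_on_buffer(struct model_bh *bh)
{
    int sleeps = 0;

    assert(bh != NULL);
    model_ref_inc(bh);      /* step 1: take a reference */
    bh->waiters++;          /* step 2: add to the wait queue */
    while (bh->locked) {    /* step 3: loop until unlocked */
        model_schedule(bh);
        sleeps++;
    }
    bh->waiters--;          /* step 4: remove from the wait queue */
    model_ref_dec(bh);      /* step 5: drop the reference */
    return sleeps;
}
```

The point of the model is that steps 1 and 5 touch the same counter through the same pointer, so the pointer has to survive everything that runs in between, including the sleep in step 3.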
After analysing the assembly code (uploaded to pastebin, links below)
at this point, I found that:
* At step (1), the address of bh->b_count is stored in register r25.
* Between steps (2) and (4), every function that uses r25 saves it and
restores it from the stack (r25 is a non-volatile register in the gcc
ABI).
* At step (5), *r25 = 0x02 ??? I don’t know why r25 changed. Maybe the
stack was corrupted somewhere?*
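The stack-corruption hypothesis can be illustrated with a small user-space model. It assumes a callee that saves r25 in its stack frame on entry and reloads it on exit. `struct model_frame` and `model_call()` are hypothetical, and the stray write is simulated on purpose. This shows only how a clobbered save slot would hand a wrong r25 back to the caller. It does not establish that this is what happens on the real system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of one callee stack frame: some local storage,
 * followed by the slot where the callee saved the caller's r25.
 */
struct model_frame {
    uint8_t   local_buf[8];
    uintptr_t saved_r25;
};

/*
 * Simulates a call made while r25 holds a live pointer.
 * Prologue: r25 is saved into the frame. Epilogue: r25 is reloaded
 * from the frame and returned. If stray_write is non-zero, a simulated
 * bug overwrites the save slot with 0x02 in between.
 */
uintptr_t model_call(uintptr_t r25_on_entry, int stray_write)
{
    struct model_frame frame;

    /* Layout assumption of this model: the slot follows the buffer. */
    assert(offsetof(struct model_frame, saved_r25) >= sizeof frame.local_buf);

    memset(&frame, 0, sizeof frame);
    frame.saved_r25 = r25_on_entry;              /* prologue: save */

    if (stray_write) {
        uintptr_t bad = 0x02;
        memcpy((uint8_t *)&frame + offsetof(struct model_frame, saved_r25),
               &bad, sizeof bad);                /* simulated corruption */
    }

    return frame.saved_r25;                      /* epilogue: restore */
}
```

In this model the caller gets its pointer back intact when nothing touches the frame, and gets 0x02 when the save slot is overwritten, which matches the symptom at step (5).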
This Oops is very difficult to reproduce (it takes about 2 hours of
running the stress test program). I tried increasing and reducing HZ by
a factor of 10, but the frequency of the bug did not change. I also
tried both ext2 and ext3, with the same result.
I’m really confused now. I don’t know where the real problem is, what
affects the frequency of the Oops, or how to debug and track down this
bug.
I’m posting my situation to this ML in the hope of getting some advice
from you.
I uploaded some of the Oopses to pastebin here:
http://vnoss.net/p/5783
http://vnoss.net/p/5785
Source and assembly of __wait_on_buffer():
http://vnoss.net/p/5784
Thanks for your help,
--
nm.
GPG Key ID: 0xDD253B25
Fingerprint: 2B17 D64A 9561 A443 2ABC 1302 4641 D0B7 DD25 3B25