* AMD64/Reiser4 testing and problems
@ 2004-12-02 23:13 Isaac Chanin
2004-12-04 21:27 ` Alex Zarochentsev
0 siblings, 1 reply; 6+ messages in thread
From: Isaac Chanin @ 2004-12-02 23:13 UTC (permalink / raw)
To: reiserfs-list
Hi,
I did some testing with Reiser4 on AMD64 and was wondering whether debug
information or anything similar could help with getting Reiser4 stable
and working on AMD64.
I have read that it would be a big help if AMD donated an AMD64 CPU, but
it doesn't seem inherently impossible to fix the problem from error
reports and such.
Anyway, here are the results of my testing. I used an mm 2.6.10-rc2
kernel and the filesystem was in a file mounted by loopback.
http://users.wpi.edu/~chanin/r4log.txt
The commands I tried were as follows:
(r4 was a 512mb file made by dd)
mount -o loop /root/r4 /root/r4dir (worked)
df -h (worked)
cd /root/r4 (worked)
ls (worked)
ls -la (worked)
mkdir linux (worked)
cp -r /usr/src/linux linux (worked, but the program hung at the end,
kill -9 had no effect)
ls (worked, but the program hung at the end, kill -9 had no effect)
cd /root/r4 (no output, program crashed immediately)
Thanks for taking the time to read over the problems. If there's any
more testing I could do, just ask. And thanks for making what will
(hopefully) soon be the fastest file system to work on my computer!
* Re: AMD64/Reiser4 testing and problems
2004-12-02 23:13 AMD64/Reiser4 testing and problems Isaac Chanin
@ 2004-12-04 21:27 ` Alex Zarochentsev
0 siblings, 0 replies; 6+ messages in thread
From: Alex Zarochentsev @ 2004-12-04 21:27 UTC (permalink / raw)
To: Isaac Chanin; +Cc: reiserfs-list
Hello Isaac
On Thu, Dec 02, 2004 at 06:13:39PM -0500, Isaac Chanin wrote:
> Hi,
>
> I did some testing with Reiser4 on AMD and was wondering if perhaps
> debug information or anything on this could help with getting Reiser4
> stable and working on AMD64.
>
> I have read that if AMD would give an AMD64 cpu that would be a big
> help, but it doesn't seem inherently impossible to fix the problem from
> error reports and such.
>
> Anyways, here are the results for my testing. I used a mm 2.6.10-rc2
> kernel and the filesystem was in a file mounted by loopback.
>
> http://users.wpi.edu/~chanin/r4log.txt
Thanks a lot for the report. Can you try the following patch?
===== plugin/space/bitmap.c 1.183 vs edited =====
--- 1.183/plugin/space/bitmap.c Wed Oct 13 17:22:01 2004
+++ edited/plugin/space/bitmap.c Sun Dec 5 00:18:55 2004
@@ -170,7 +170,7 @@
 static int
 find_next_zero_bit_in_word(ulong_t word, int start_bit)
 {
-	unsigned int mask = 1 << start_bit;
+	ulong_t mask = 1 << start_bit;
 	int i = start_bit;
 
 	while ((word & mask) != 0) {
@@ -234,7 +234,7 @@
 /* search for the first set bit in single word. */
 static int find_last_set_bit_in_word (ulong_t word, int start_bit)
 {
-	unsigned bit_mask;
+	ulong_t bit_mask;
 	int nr = start_bit;
 
 	assert ("zam-965", start_bit < BITS_PER_LONG);
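
For readers following the patch, the reason the mask width matters on
AMD64 can be sketched in a few lines. This is only an illustration, not
the reiser4 code: word_t, WORD_BITS, next_zero_bit_narrow and
next_zero_bit_wide are hypothetical names, the fixed-width types stand in
for ulong_t (64 bits on AMD64) and unsigned int (32 bits), and the loop
bodies are assumed, since the patch shows only the first line of the loop.

```c
#include <stdint.h>

/* Hypothetical stand-ins: on AMD64, unsigned long is 64 bits wide and
 * unsigned int is 32 bits wide. */
typedef uint64_t word_t;
#define WORD_BITS 64

/* Pre-patch shape: the mask is only 32 bits wide.
 * Only meaningful for start_bit < 32. */
static int next_zero_bit_narrow(word_t word, int start_bit)
{
	uint32_t mask = (uint32_t)1 << start_bit;
	int i = start_bit;

	/* Assumed loop body: walk the mask upwards until a clear bit is hit. */
	while ((word & mask) != 0) {
		mask <<= 1;	/* becomes 0 once bit 31 is shifted out */
		i++;
	}
	return i;
}

/* Post-patch shape: the mask is as wide as the word being scanned. */
static int next_zero_bit_wide(word_t word, int start_bit)
{
	/* Widen the constant before shifting so the shift is done in word_t. */
	word_t mask = (word_t)1 << start_bit;
	int i = start_bit;

	while (i < WORD_BITS && (word & mask) != 0) {
		mask <<= 1;
		i++;
	}
	return i;	/* WORD_BITS means no clear bit at or after start_bit */
}
```

With bits 0 through 40 set, the narrow version reports bit 32 as free
because its mask is shifted out to zero after 32 steps, while the wide
version reports bit 41. Note also that in C the constant 1 has type int,
so an expression such as 1 << start_bit is evaluated as an int before it
is assigned; the sketch casts the constant to the word type first, which
may be worth checking in the patched code as well.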
* Re: AMD64/Reiser4 testing and problems
@ 2004-12-08 8:04 Isaac Chanin
0 siblings, 0 replies; 6+ messages in thread
From: Isaac Chanin @ 2004-12-08 8:04 UTC (permalink / raw)
To: reiserfs-list
Hey Alex,
Thanks for the patch. I wish there was more I could say, but so far I
haven't been able to produce any problems whatsoever with this new patch
on reiser4/amd64. What I tried was Jake Maciejewski's kernel compiling
command, which I modified slightly to better suit my system. The
following ran for at least 30 minutes and a couple of iterations of
kernel compiling:
  for i in `seq 1 20` ; do make mrproper ;
    cat /boot/2.6.10-r4-mm-config > .config ; make ; echo $i ; done &
  for i in `seq 1 5` ; do
    dd if=/dev/zero of=large_file bs=1M count=20k ; rm large_file ;
    echo $i ; done
I also tried filling the filesystem, and testing for data retention
after forced umounts. For fun I also tried running ext2 and a second
reiser4 mounted on files inside the reiser4 filesystem. Everything worked
fine; I was unable to produce a single error in the logs.
The only thing that could possibly be said to have gone wrong was force
umounting with open file handles, which didn't work out quite as well.
However, for all sane usage reiser4 seems rather stable on amd64 to me.
I will try running an entire linux system on it once I get the time, and
perhaps then will be able to give you some more feedback.
So once again, thanks for the patch, and best of luck with reiser4,
Isaac
* Re: AMD64/Reiser4 testing and problems
@ 2004-12-21 5:46 Isaac Chanin
2004-12-22 5:45 ` Isaac Chanin
0 siblings, 1 reply; 6+ messages in thread
From: Isaac Chanin @ 2004-12-21 5:46 UTC (permalink / raw)
To: reiserfs-list
Hello Alex et al,
After my last e-mail was so useless I figured I would follow it up with
a more useful one, so what I've attempted to do is build an entire AMD64
system on top of reiser4 (aside from boot, still ext2 there), and in the
process I've run into two more bugs.
The first bug would give output (without reiser4 debugging options
enabled) like:
Dec 20 02:33:14 [kernel] reiser4[portageq(31681)]: traverse_tree
(fs/reiser4/search.c:789)[nikita-373]:
Dec 20 02:33:14 [kernel] reiser4[portageq(31681)]: traverse_tree
(fs/reiser4/search.c:755)[nikita-1481]:
over and over until the system was rebooted (whatever process is
involved becomes completely unresponsive to all kill signals and takes
100% cpu time; so far it has happened with portageq, wget, and I think
nano.)
So far I have absolutely no clue when this bug occurs, as it seems to pop
up completely randomly (copying a large file from one reiser4 logical
partition to another, wget'ing a tiny file to a reiser4 partition.) About
the only connection I could find was with writing to the filesystem.
I could only produce this bug on 2.6.10-rc2-mm4; linux-2.6.10-rc3-mm1
(both with Alex's patch) would boot (onto reiser3) but wouldn't do much
else.
This leads nicely into the other bug, which is that it does not seem
possible to boot into a reiser4 system (ext2 for boot, reiser4 for /, et
cetera) on an AMD64. I don't have any log output as the system hangs
before it can load the logger, and no screen output because my laptop (no
matter what framebuffer configuration I use) has a completely black boot-up
screen until after the initial kernel loading is complete. Neither
2.6.10-rc2-mm4 nor linux-2.6.10-rc3-mm1 can get any farther in the booting
process either.
It occurred to me that this could simply be happening because of all of
the failures the partitions had while getting the system installed on them
(though fsck.reiser4 seemed to handle them all nicely, at the time.) So
perhaps this bug is not really a bug, and more of a misconfiguration on my
part - just a thought.
I'll put some full (with reiser4 debug options enabled in the kernel -
still can't believe I forgot those) logs up at
http://users.wpi.edu/~chanin/newr4log.txt once I can get the first bug to
occur again (or if I can somehow get some output from the second.) I
should be able to get them up later tonight and if not then, probably
sometime tomorrow.
Thanks,
Isaac
* Re: AMD64/Reiser4 testing and problems
2004-12-21 5:46 Isaac Chanin
@ 2004-12-22 5:45 ` Isaac Chanin
0 siblings, 0 replies; 6+ messages in thread
From: Isaac Chanin @ 2004-12-22 5:45 UTC (permalink / raw)
To: reiserfs-list
Hi everybody,
I have a few updates. The second bug that I mentioned earlier has been
dismissed: it is definitely possible to boot into a reiser4 system on
AMD64. The first bug is very hard to replicate. I've tried everything
that caused it the first time without any luck - I'm starting to think
that it must be caused by some race condition (and hence comes up
randomly), or that one of the debug options prevents it from happening
(however unlikely that may be.)
However, it may have shown its face again - after successfully compiling
everything from glibc to xfce4 I finally got another reiser4 failure. It
happened pretty much like so:
---------------------------------------------------------
root /etc/X11 # /etc/init.d/gpm stop
reiser4 panicked cowardly: assertion failed:
reiser4_find_next_zero_bit(bnode_working_data(bnode), end_offset,
start_offset) >= end_offset
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at debug:131
invalid operand: 0000 [1]
CPU 0
Modules linked in:
Pid: 6648, comm: runscript.sh Tainted: G M 2.6.10-rc2-mm4
RIP: 0010:[<ffffffff801a0070>] <ffffffff801a0070>{reiser4_do_panic+768}
RSP: 0018:ffff81003d5497c8 EFLAGS: 00010246
RAX: ffffffff80564ac0 RBX: ffff81003d549da8 RCX: ffffffff80564ac0
RDX: ffff81003d549e78 RSI: ffffffff80564ac0 RDI: ffff81003f1e9400
RBP: ffff81003f1e9400 R08: 0000000000000000 R09: 0000000000000005
R10: 00000000ffffffff R11: 0000000000000000 R12: 00000000000009e4
R13: 00000000000009e4 R14: 0000000000000001 R15: 00000000000009c3
FS: 00002aaaaaeb8700(0000) GS:ffffffff805a2080(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00005555556729c0 CR3: 000000003d844000 CR4: 00000000000006e0
Process runscript.sh (pid: 6648, threadinfo ffff81003d548000, task
ffff81003e8ee860)
Stack: 0000003000000010 ffff81003d5498a8 ffff81003d5497e8 00000000000019f8
ffffffff804081b0 ffffffff80441d90 0000000000000000 0000000000000000
0000000000000000 0000000000000003
Call Trace:<ffffffff801a0348>{get_current_log_flags+8}
<ffffffff801eda78>{reiser4_block_count+72}
<ffffffff801a00a9>{schedulable+9}
<ffffffff802617a0>{load_and_lock_bnode+96}
<ffffffff80260ff6>{parse_blocknr+326}
<ffffffff8019fd55>{reiser4_print_prefix+133}
<ffffffff8019fc29>{report_err+9}
<ffffffff802627a2>{check_blocks_bitmap+1042}
<ffffffff801c62b6>{reiser4_check_block+22}
<ffffffff801a7df0>{zget+1088}
<ffffffff80200c27>{do_reiser4_file_readahead+999}
<ffffffff8015dd63>{handle_mm_fault+1107}
<ffffffff80201090>{reiser4_file_readahead+240}
<ffffffff8026b627>{read_unix_file+775}
<ffffffff80235d20>{read_tail+0}
<ffffffff8026911c>{unix_file_filemap_nopage+236}
<ffffffff801bea75>{done_context+741}
<ffffffff801fb1f1>{reiser4_read+689}
<ffffffff8015cb79>{do_wp_page+153}
<ffffffff8016cc27>{vfs_read+199}
<ffffffff8016cf13>{sys_read+83}
<ffffffff8010e16a>{system_call+126}
Code: 0f 0b aa 1a 42 80 ff ff ff ff 83 00 48 c7 c6 40 46 56 80 48
RIP <ffffffff801a0070>{reiser4_do_panic+768} RSP <ffff81003d5497c8>
Segmentation fault
--------------------------------------------------------
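
The failed assertion above reads as a statement about a range of bits:
searching for a zero bit from start_offset up to end_offset should find
none. A rough sketch of that check follows. It assumes that a set bit
means an allocated block and that the (bitmap, limit, start) argument
order mirrors the call in the assertion text; the function names are
hypothetical and the loop is a plain bit-by-bit scan, not the reiser4
implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define WORD_BITS 64	/* assumed width of one bitmap word on AMD64 */

/* Hypothetical helper: index of the first clear bit in [start, max),
 * or max if every bit in that range is set. */
static size_t find_next_zero_bit_sketch(const uint64_t *map,
					size_t max, size_t start)
{
	size_t bit;

	for (bit = start; bit < max; bit++) {
		/* The mask must be as wide as the word it is tested against. */
		uint64_t mask = (uint64_t)1 << (bit % WORD_BITS);

		if ((map[bit / WORD_BITS] & mask) == 0)
			return bit;
	}
	return max;
}

/* The condition the assertion expresses: no clear bit in [start, end). */
static int range_is_all_set(const uint64_t *map, size_t start, size_t end)
{
	return find_next_zero_bit_sketch(map, end, start) >= end;
}
```

Under these assumptions the assertion fires when the search returns an
index below end_offset, that is, when some bit in the checked range is
clear or the search routine wrongly reports one as clear.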
A few things to note about the circumstances:
- I had restarted without properly umounting the drives a few times
before the bug occurred (I'm not so sure how much fsck checking is
happening at boot-time, it doesn't seem like much.)
- Everything worked fine after the bug until...
- Once I tried to stop gpm again, it went into an infinite loop of
outputting errors (the syslogger died, so I couldn't get them) much like
the previous error.
Also, if anyone has any suggestions, based on either of my past two
e-mails, about anything to try (to either fix the bug or cause it to occur
again), please feel free to suggest it - the output from these really tells
me next to nothing so I'm kind of working blind here.
Anyways, I hope that output is a bit more helpful,
Isaac
Also, sorry about responding to myself three times now - it's just that it
took me so long to get another bug that I figured an update was in order
(not like it's so bad compared to the real spam on the list).
* Re: AMD64/Reiser4 testing and problems
2005-01-01 21:49 Recursive modfied-timestamp? Hans Reiser
@ 2005-01-02 4:22 ` Isaac Chanin
0 siblings, 0 replies; 6+ messages in thread
From: Isaac Chanin @ 2005-01-02 4:22 UTC (permalink / raw)
To: reiserfs-list
Hello all,
Just responding to my previous messages a bit more. Not too much new to
say, aside from a bunch of new bug report/error messages.
If you're interested they're at http://users.wpi.edu/~chanin/r4more.txt.
The old 'random' bug is still popping up. Definitely looks like it has
something to do with the reiser4_find_next_zero_bit function in bitmap.c.
I've looked through the file (and includes) and haven't found anything
obvious - but my C skills aren't quite what they should be for debugging
something like this.
Also, there appears to be a new bug, or perhaps a simple fluke event, that
resulted in some random file corruption - I've yet to formulate even
uninformed opinions about what caused that one, however.
Finally, if there's no need for more bug reports - apparently my last one
did not warrant a patch or response (or some people just enjoy the season
more than I do) - feel free to tell me. I do recall reading that an
x86_64 machine would be on its way to namesys soon.
Thanks,
Isaac