* newly created 8.5TB reiserfs fails fsck on amd64 and causes OOPS
@ 2005-07-19 11:47 Paul Slootman
  2005-07-28 13:56 ` Vitaly Fertman
  0 siblings, 1 reply; 4+ messages in thread
From: Paul Slootman @ 2005-07-19 11:47 UTC
  To: reiserfs-list

This is on a dual-CPU opteron system, with 2 x 3ware 9500 12-channel
SATA controllers for a total of 8.5TB; I've configured a RAID5 over each
3ware controller, and use linux md RAID0 over those two "devices".
There was an issue with linux md RAID0 for that size, but that's been
resolved (at least, the problem I had first :-)

The device itself seems to work fine, as reiser4 works. I
wanted to compare to reiserfs 3.6, so I created a reiserfs, mounted it,
and tried to use it. Running bonnie++ on it caused an oops, apparently
in the reiserfs code:

kernel: Unable to handle kernel paging request at 00002aaaaaad79ea RIP:
kernel: <ffffffff801b0ac1>{scan_bitmap_block+129}
kernel: PGD 954ad067 PUD 9a1af067 PMD b75bc067 PTE 0
kernel: Oops: 0000 [1] SMP
kernel: CPU 0
kernel: Modules linked in: raid0 reiser4 zlib_deflate zlib_inflate raid5 raid6 xor ipv6 evdev tg3 3w_9xxx hw_random i2c_amd756 i2c_amd8111 i2c_core psmouse rtc
kernel: Pid: 12006, comm: bonnie++ Not tainted 2.6.12.2.raid0fixreiser4
kernel: RIP: 0010:[scan_bitmap_block+129/768] <ffffffff801b0ac1>{scan_bitmap_block+129}
kernel: RSP: 0018:ffff81007f461a18  EFLAGS: 00010286
kernel: RAX: 00002aaaaaad79ea RBX: 000000000000001c RCX: 00000000000021e0
kernel: RDX: 0000000000000000 RSI: 000000000000001c RDI: ffff81007f461d98
kernel: RBP: ffffc2000050a1c0 R08: 0000000000000001 R09: 0000000000000011
kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff81007f461a9c
kernel: R13: 00000000000021e0 R14: ffff81007f0ffc00 R15: 0000000000000011
kernel: FS:  00002aaaab26f8c0(0000) GS:ffffffff804b7480(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
kernel: CR2: 00002aaaaaad79ea CR3: 00000000d9d11000 CR4: 00000000000006e0
kernel: Process bonnie++ (pid: 12006, threadinfo ffff81007f460000, task ffff81007feee0f0)
kernel: Stack: 0000000000000010 00000000801cabce 0000001c00000001 ffff81007f461d98
kernel:        ffff81009f6c9018 000000000000001c ffff81007f0ffc00 000000000000001c
kernel:        ffff81007f461d98 0000000000000001
kernel: Call Trace:<ffffffff801b10a9>{scan_bitmap+585} <ffffffff801b2313>{reiserfs_allocate_blocknrs+803}
kernel:        <ffffffff801bc08c>{reiserfs_allocate_blocks_for_region+524}
kernel:        <ffffffff80178f36>{alloc_page_buffers+102} <ffffffff801ca1a8>{pathrelse+40}
kernel:        <ffffffff801484b0>{autoremove_wake_function+0} <ffffffff801be315>{reiserfs_file_write+1045}
kernel:        <ffffffff801655ad>{do_anonymous_page+861} <ffffffff80165b90>{handle_mm_fault+304}
kernel:        <ffffffff80203dd1>{__up_read+33} <ffffffff8011e289>{do_page_fault+601}
kernel:        <ffffffff8032a7d3>{_spin_lock+3} <ffffffff80176c90>{vfs_write+192}
kernel:        <ffffffff80176de3>{sys_write+83} <ffffffff8010d91a>{system_call+126}
kernel:
kernel:
kernel: Code: 8b 00 48 c1 e8 02 a8 01 74 17 49 8b 86 70 02 00 00 48 ff 80
kernel: RIP <ffffffff801b0ac1>{scan_bitmap_block+129} RSP <ffff81007f461a18>
kernel: CR2: 00002aaaaaad79ea


I rebooted (hard, as a shutdown didn't work...). After that, I tried a
mkfs followed by an fsck, which gives an error! Here's the console log:


satazilla:~# mkfs.reiserfs /dev/md13
mkfs.reiserfs 3.6.19 (2003 www.namesys.com)

A pair of credits:
Joshua Macdonald wrote the first draft of the transaction manager. Yuri Rupasov
did testing  and benchmarking,  plus he invented the r5 hash  (also used by the
dcache  code).  Yura  Rupasov,  Anatoly Pinchuk,  Igor Krasheninnikov,  Grigory
Zaigralin,  Mikhail  Gilula,   Igor  Zagorovsky,  Roman  Pozlevich,  Konstantin
Shvachko, and Joshua MacDonald are former contributors to the project.

The  Defense  Advanced  Research  Projects Agency (DARPA, www.darpa.mil) is the
primary sponsor of Reiser4.  DARPA  does  not  endorse  this project; it merely 
sponsors it.


Guessing about desired format.. Kernel 2.6.12.2.raid0fixreiser4 is running.
Format 3.6 with standard journal
Count of blocks on the device: 2148377056
Number of blocks consumed by mkreiserfs formatting process: 8239
Blocksize: 4096
Hash function used to sort names: "r5"
Journal Size 8193 blocks (first block 18)
Journal Max transaction length 1024
inode generation number: 0
UUID: 10ae60ee-1abb-49d1-ae55-cf238626c0b5
ATTENTION: YOU SHOULD REBOOT AFTER FDISK!
        ALL DATA WILL BE LOST ON '/dev/md13'!
Continue (y/n):y
Initializing journal - 0%....20%....40%....60%....80%....100%
Syncing..ok

Tell your friends to use a kernel based on 2.4.18 or later, and especially not a
kernel based on 2.4.9, when you use reiserFS. Have fun.

ReiserFS is successfully created on /dev/md13.
satazilla:~# reiserfsck  /dev/md13
reiserfsck 3.6.19 (2003 www.namesys.com)

*************************************************************
** If you are using the latest reiserfsprogs and  it fails **
** please  email bug reports to reiserfs-list@namesys.com, **
** providing  as  much  information  as  possible --  your **
** hardware,  kernel,  patches,  settings,  all reiserfsck **
** messages  (including version),  the reiserfsck logfile, **
** check  the  syslog file  for  any  related information. **
** If you would like advice on using this program, support **
** is available  for $25 at  www.namesys.com/support.html. **
*************************************************************

Will read-only check consistency of the filesystem on /dev/md13
Will put log info to 'stdout'

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
###########
reiserfsck --check started at Tue Jul 19 13:29:22 2005
###########
Replaying journal..
No transactions found
reiserfs_open_ondisk_bitmap: wrong either bitmaps number,
count of blocks or blocksize, run with --rebuild-sb to fix it
reiserfsck: Could not open bitmap


When I try running with --rebuild-sb it says:

Reiserfs super block in block 16 on 0x90d of format 3.6 with standard journal
Count of blocks on the device: 2148377056
Number of bitmaps: 28
Blocksize: 4096
Free blocks (count of blocks - used [journal, bitmaps, data, reserved] blocks): 2148368817
Root block: 8211
Filesystem is clean
Tree height: 2
Hash function used to sort names: "r5"
Objectid map size 2, max 972
Journal parameters:
        Device [0x0]
        Magic [0x168ed58c]
        Size 8193 blocks (including 1 for journal header) (first block 18)
        Max transaction length 1024 blocks
        Max batch size 900 blocks
        Max commit age 30
Blocks reserved by journal: 0
Fs state field: 0x0:
sb_version: 2
inode generation number: 0
UUID: 10ae60ee-1abb-49d1-ae55-cf238626c0b5
LABEL: 
Set flags in SB:
        ATTRIBUTES CLEAN

Super block seems to be correct



Something seems seriously wrong here.
I'm happy to run any tests or try any patches, this system is mine to play with
until the end of the month.


Paul Slootman


* Re: newly created 8.5TB reiserfs fails fsck on amd64 and causes OOPS
  2005-07-19 11:47 newly created 8.5TB reiserfs fails fsck on amd64 and causes OOPS Paul Slootman
@ 2005-07-28 13:56 ` Vitaly Fertman
  2005-07-28 16:36   ` Jeff Mahoney
  0 siblings, 1 reply; 4+ messages in thread
From: Vitaly Fertman @ 2005-07-28 13:56 UTC
  To: reiserfs-list

Hello, 

On Tuesday 19 July 2005 15:47, Paul Slootman wrote:
> This is on a dual-CPU opteron system, with 2 x 3ware 9500 12-channel
> SATA controllers for a total of 8.5TB; I've configured a RAID5 over each
> 3ware controller, and use linux md RAID0 over those two "devices".
> There was an issue with linux md RAID0 for that size, but that's been
> resolved (at least, the problem I had first :-)
> 
> The device itself seems to work fine, as reiser4 works. I
> wanted to compare to reiserfs 3.6, so I created a reiserfs, mounted it,
> and tried to use it. Running bonnie++ on it caused an oops, apparently
> in the reiserfs code:
> 
> I rebooted (hard, as a shutdown didn't work...). After that, I tried a
> mkfs followed by an fsck, which gives an error! Here's the console log:
> 
> 
> satazilla:~# mkfs.reiserfs /dev/md13
> mkfs.reiserfs 3.6.19 (2003 www.namesys.com)
> 
> Guessing about desired format.. Kernel 2.6.12.2.raid0fixreiser4 is running.
> Format 3.6 with standard journal
> Count of blocks on the device: 2148377056

Ahh, indeed, this number of blocks needs 65564 bitmap blocks,
whereas the bitmap count field in the super block is only 16 bits
wide. In other words, this limits the reiserfs size to
65535 * BlockSize * 8 * BlockSize; for BlockSize == 4K that is 8T.

The check for bitmap block count overflow seems to be missing
from progs. Hmm, and our FAQ entry about 16TB is not correct either...
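
The arithmetic above can be checked with a short sketch. It uses only the
block count and block size printed by mkfs.reiserfs; the function and
variable names are illustrative and are not taken from reiserfsprogs or the
kernel. It assumes one bitmap block tracks BlockSize * 8 blocks, as the
formula above implies. Masking the needed count to 16 bits gives 28, which
is consistent with the "Number of bitmaps: 28" line in the --rebuild-sb
output.

```python
# Sketch only: names here are hypothetical, not from reiserfsprogs or the kernel.
BLOCK_SIZE = 4096           # bytes, "Blocksize" in the mkfs output
BLOCK_COUNT = 2148377056    # "Count of blocks on the device"


def bitmaps_needed(block_count, block_size):
    """Bitmap blocks needed when each bitmap block holds block_size * 8 bits."""
    bits_per_bitmap = block_size * 8
    # Round up: a partially used bitmap block still has to exist.
    return (block_count + bits_per_bitmap - 1) // bits_per_bitmap


needed = bitmaps_needed(BLOCK_COUNT, BLOCK_SIZE)

# What is left of that count after it is stored in a 16-bit field.
stored = needed & 0xFFFF

# Largest size addressable with a 16-bit bitmap count, in bytes.
max_bytes = 0xFFFF * (BLOCK_SIZE * 8) * BLOCK_SIZE

print(needed)               # 65564
print(stored)               # 28
print(max_bytes / 2**40)    # just under 8 (TiB)
```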

> Number of blocks consumed by mkreiserfs formatting process: 8239
> Blocksize: 4096
> Hash function used to sort names: "r5"
> Journal Size 8193 blocks (first block 18)
> Journal Max transaction length 1024
> inode generation number: 0
> UUID: 10ae60ee-1abb-49d1-ae55-cf238626c0b5
> ATTENTION: YOU SHOULD REBOOT AFTER FDISK!
>         ALL DATA WILL BE LOST ON '/dev/md13'!
> Continue (y/n):y
> Initializing journal - 0%....20%....40%....60%....80%....100%
> Syncing..ok
> 
> Tell your friends to use a kernel based on 2.4.18 or later, and especially not a
> kernel based on 2.4.9, when you use reiserFS. Have fun.
> 
> ReiserFS is successfully created on /dev/md13.
> satazilla:~# reiserfsck  /dev/md13
> reiserfsck 3.6.19 (2003 www.namesys.com)
> 
> Will read-only check consistency of the filesystem on /dev/md13
> Will put log info to 'stdout'
> 
> Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
> ###########
> reiserfsck --check started at Tue Jul 19 13:29:22 2005
> ###########
> Replaying journal..
> No transactions found
> reiserfs_open_ondisk_bitmap: wrong either bitmaps number,
> count of blocks or blocksize, run with --rebuild-sb to fix it
> reiserfsck: Could not open bitmap


 
> When I try running with --rebuild-sb it says:
> 
> Reiserfs super block in block 16 on 0x90d of format 3.6 with standard journal
> Count of blocks on the device: 2148377056
> Number of bitmaps: 28
> Blocksize: 4096
> Free blocks (count of blocks - used [journal, bitmaps, data, reserved] blocks): 2148368817
> Root block: 8211
> Filesystem is clean
> Tree height: 2
> Hash function used to sort names: "r5"
> Objectid map size 2, max 972
> Journal parameters:
>         Device [0x0]
>         Magic [0x168ed58c]
>         Size 8193 blocks (including 1 for journal header) (first block 18)
>         Max transaction length 1024 blocks
>         Max batch size 900 blocks
>         Max commit age 30
> Blocks reserved by journal: 0
> Fs state field: 0x0:
> sb_version: 2
> inode generation number: 0
> UUID: 10ae60ee-1abb-49d1-ae55-cf238626c0b5
> LABEL: 
> Set flags in SB:
>         ATTRIBUTES CLEAN
> 
> Super block seems to be correct
> 
> 
> 
> Something seems seriously wrong here.
> I'm happy to run any tests or try any patches, this system is mine to play with
> until the end of the month.
> 
> 
> Paul Slootman
> 
> 

-- 
Thanks,
Vitaly Fertman


* Re: newly created 8.5TB reiserfs fails fsck on amd64 and causes OOPS
  2005-07-28 13:56 ` Vitaly Fertman
@ 2005-07-28 16:36   ` Jeff Mahoney
  2005-07-28 17:15     ` Vladimir V. Saveliev
  0 siblings, 1 reply; 4+ messages in thread
From: Jeff Mahoney @ 2005-07-28 16:36 UTC
  To: Vitaly Fertman; +Cc: reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Vitaly Fertman wrote:
> Hello, 
> 
> On Tuesday 19 July 2005 15:47, Paul Slootman wrote:
>>This is on a dual-CPU opteron system, with 2 x 3ware 9500 12-channel
>>SATA controllers for a total of 8.5TB; I've configured a RAID5 over each
>>3ware controller, and use linux md RAID0 over those two "devices".
>>There was an issue with linux md RAID0 for that size, but that's been
>>resolved (at least, the problem I had first :-)
>>
>>The device itself seems to work fine, as reiser4 works. I
>>wanted to compare to reiserfs 3.6, so I created a reiserfs, mounted it,
>>and tried to use it. Running bonnie++ on it caused an oops, apparently
>>in the reiserfs code:
>>
>>I rebooted (hard, as a shutdown didn't work...). After that, I tried a
>>mkfs followed by an fsck, which gives an error! Here's the console log:
>>
>>
>>satazilla:~# mkfs.reiserfs /dev/md13
>>mkfs.reiserfs 3.6.19 (2003 www.namesys.com)
>>
>>Guessing about desired format.. Kernel 2.6.12.2.raid0fixreiser4 is running.
>>Format 3.6 with standard journal
>>Count of blocks on the device: 2148377056
> 
> Ahh, indeed, this number of blocks needs 65564 bitmap blocks,
> whereas the bitmap count field in the super block is only 16 bits
> wide. In other words, this limits the reiserfs size to
> 65535 * BlockSize * 8 * BlockSize; for BlockSize == 4K that is 8T.
> 
> The check for bitmap block count overflow seems to be missing
> from progs. Hmm, and our FAQ entry about 16TB is not correct either...

Out of curiosity, why is the number of bitmaps even needed if it can be
calculated?

If that's truly the limiting factor, could we perhaps set s_bmap_nr = 0
and calculate the number of bitmaps at mount time? The s_bmap_nr = 0
would ensure that a mount of the filesystem on a kernel unaware of the
larger size would fail since it would fail allocating memory to store
the buffer heads.

It's not friendly, but neither is advertising a 16TB filesystem size,
when there is a limit at 8TB on most systems.
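
A minimal sketch of that idea, written in Python rather than kernel C for
brevity. The function name and its arguments are hypothetical; only
s_bmap_nr is a name used in this thread. It assumes a stored value of 0
means "derive the count from the block count and block size".

```python
# Hypothetical sketch of the proposal above; this is not actual kernel code.
def bitmap_count_at_mount(s_bmap_nr, block_count, block_size):
    """Return the number of bitmap blocks to use when mounting."""
    if s_bmap_nr != 0:
        # Normal case: trust the 16-bit value stored in the super block.
        return s_bmap_nr
    # Proposed case: a stored 0 means the count did not fit in 16 bits,
    # so compute it, rounding up to cover a partially used last bitmap.
    bits_per_bitmap = block_size * 8
    return (block_count + bits_per_bitmap - 1) // bits_per_bitmap


# The 8.5TB device from this thread, with the field set to 0.
print(bitmap_count_at_mount(0, 2148377056, 4096))

# A small filesystem whose count fits in the field is left unchanged.
print(bitmap_count_at_mount(8, 8 * 32768, 4096))
```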

- -Jeff

- --
Jeff Mahoney
SuSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)

iD8DBQFC6QmpLPWxlyuTD7IRApKgAJ9djDA5MrAWrnT8T/JwobMMankNwgCfRKY6
lE0x+U5lemBsw0k8G8iwCHc=
=SdkN
-----END PGP SIGNATURE-----


* Re: newly created 8.5TB reiserfs fails fsck on amd64 and causes OOPS
  2005-07-28 16:36   ` Jeff Mahoney
@ 2005-07-28 17:15     ` Vladimir V. Saveliev
  0 siblings, 0 replies; 4+ messages in thread
From: Vladimir V. Saveliev @ 2005-07-28 17:15 UTC
  To: Jeff Mahoney; +Cc: Vitaly Fertman, reiserfs-list

Hello

Jeff Mahoney wrote:
> Vitaly Fertman wrote:
> 
>>>Hello, 
>>>
>>>On Tuesday 19 July 2005 15:47, Paul Slootman wrote:
>>>
>>>>This is on a dual-CPU opteron system, with 2 x 3ware 9500 12-channel
>>>>SATA controllers for a total of 8.5TB; I've configured a RAID5 over each
>>>>3ware controller, and use linux md RAID0 over those two "devices".
>>>>There was an issue with linux md RAID0 for that size, but that's been
>>>>resolved (at least, the problem I had first :-)
>>>>
>>>>The device itself seems to work fine, as reiser4 works. I
>>>>wanted to compare to reiserfs 3.6, so I created a reiserfs, mounted it,
>>>>and tried to use it. Running bonnie++ on it caused an oops, apparently
>>>>in the reiserfs code:
>>>>
>>>>I rebooted (hard, as a shutdown didn't work...). After that, I tried a
>>>>mkfs followed by an fsck, which gives an error! Here's the console log:
>>>>
>>>>
>>>>satazilla:~# mkfs.reiserfs /dev/md13
>>>>mkfs.reiserfs 3.6.19 (2003 www.namesys.com)
>>>>
>>>>Guessing about desired format.. Kernel 2.6.12.2.raid0fixreiser4 is running.
>>>>Format 3.6 with standard journal
>>>>Count of blocks on the device: 2148377056
>>>
>>>Ahh, indeed, this number of blocks needs 65564 bitmap blocks,
>>>whereas the bitmap count field in the super block is only 16 bits
>>>wide. In other words, this limits the reiserfs size to
>>>65535 * BlockSize * 8 * BlockSize; for BlockSize == 4K that is 8T.
>>>
>>>The check for bitmap block count overflow seems to be missing
>>>from progs. Hmm, and our FAQ entry about 16TB is not correct either...
> 
> 
> Out of curiosity, why is the number of bitmaps even needed if it can be
> calculated?
> 
Well, usually, at least for me, when you look at code you wrote some time ago
(8 years, for example), you always wonder "how could I have written that?".
So, reiserfs could do just fine without it.

> If that's truly the limiting factor, could we perhaps set s_bmap_nr = 0
> and calculate the number of bitmaps at mount time? The s_bmap_nr = 0
> would ensure that a mount of the filesystem on a kernel unaware of the
> larger size would fail since it would fail allocating memory to store
> the buffer heads.
> 
> It's not friendly, but neither is advertising a 16TB filesystem size,
> when there is a limit at 8TB on most systems.
> 
> -Jeff
> 
> --
> Jeff Mahoney
> SuSE Labs
