From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alexander Zarochentsev
Subject: Re: Reiser4 SCSI Bug?
Date: Mon, 31 Oct 2005 10:24:25 +0400
Message-ID: <200510310924.26782.zam@namesys.com>
References: <2066.130.215.239.65.1130521750.squirrel@webmail.WPI.EDU> <43639DFA.1020107@namesys.com> <43640228.7080807@wpi.edu>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Return-path:
list-help:
list-unsubscribe:
list-post:
Errors-To: flx@namesys.com
In-Reply-To: <43640228.7080807@wpi.edu>
Content-Disposition: inline
List-Id:
Content-Type: text/plain; charset="us-ascii"
To: Isaac Chanin
Cc: Hans Reiser, Steve Olivieri, reiserfs-list@namesys.com, vs

Hi,

On Sunday 30 October 2005 02:13, Isaac Chanin wrote:
> Hi Hans,
>
> I don't think it's a device driver problem. My fix derived simply from
> noticing that the function where the problem originates is bvec_alloc_bs
> in fs/bio.c.
>
> To trace, the problem goes something like:
> (fs/reiser4/wander.c) write_jnodes_to_disk_extent -> (fs/bio.c)
> bio_alloc -> bio_alloc_bioset -> bvec_alloc_bs
>
> In any case, BIO_MAX_PAGES is defined to be 256 in include/linux/bio.h,
> thus bvec_alloc_bs returns null, likewise bio_alloc_bioset does the
> same, as does bio_alloc. And thus the error in wander.c.
>
> Looking at the problem from the other direction, the max_blocks check
> that was already in wander.c looks like:
>
> max_blocks = bdev_get_queue(super->s_bdev)->max_sectors >>
> (super->s_blocksize_bits - 9);
>
> I'd look into the construction of the super block around
> s_blocksize_bits, or however max_sectors is derived, to see if there're
> any problems there (I suppose a device driver very well could be giving
> a bad value that gets used somewhere in that code), but I really don't
> know the reiser4 code well enough to do so. In any case, as a temporary
> (or perhaps permanent, as it seems that the limit is hardcoded in the
> kernel anyways) solution it would probably be better to put the min(256,
> ...) check around where max_blocks is set rather than around nr_blocks
> inside the while loop.

fs/mpage.c uses bio.h:bio_get_nr_vecs() as below:

...
alloc_new:
	if (bio == NULL) {
		bio = mpage_alloc(bdev, blocks[0] << (blkbits - 9),
				min_t(int, nr_pages, bio_get_nr_vecs(bdev)),
				GFP_KERNEL);
		if (bio == NULL)
			goto confused;
	}
...

Would you please check whether the following patch helps as yours does:

-----------------------------------------
diff --git a/wander.c b/wander.c
index 4038000..b606063 100644
--- a/wander.c
+++ b/wander.c
@@ -731,9 +731,7 @@ static int write_jnodes_to_disk_extent(
 	assert("zam-570", nr > 0);
 
 	block = *block_p;
-	max_blocks =
-	    bdev_get_queue(super->s_bdev)->max_sectors >>
-	    (super-> s_blocksize_bits - 9);
+	max_blocks = bio_get_nr_vecs(super->s_bdev);
 
 	while (nr > 0) {
 		struct bio *bio;
======================

>
>
> Isaac
>
> Hans Reiser wrote:
> > Thanks Isaac!
> >
> > Does IDE have a different maximum? Is there a non-hack way to discover
> > the maximum of the device driver underneath?
> >
> > Hans
> >
> > Isaac Chanin wrote:
> >>Steve Olivieri wrote:
> >>>Greetings!
> >>>
> >>>I have a Maxtor Atlas II 15k, 147GB SCSI hard disk and an LSI Logic
> >>>LSI21320 Ultra320 SCSI Host Bus Adapter for PCI-X (though it's running
> >>> in a PCI slot). I have partitioned it and installed a temporary Gentoo
> >>> system with the reiserfs filesystem. My boot partition uses ext2.
> >>> These partitions seem perfectly stable. I have booted this temporary
> >>> system multiple times and I have attempted to use it to install a
> >>> permanent system with the reiser4 filesystem.
> >>>
> >>>After making the filesystems on each partition and mounting them, I was
> >>>able to download two tarballs (the stage and portage snapshot). While
> >>>extracting the first, the terminal seems to hang.
> >>>
> >>>I switched to another terminal to see what was wrong. Everything works
> >>>great until I try to do anything at all on a reiser4 partition. Then,
> >>>this terminal also hangs. Repeat until I'm out of terminals.
> >>>
> >>>I've been able to repeat this problem a number of times. The only
> >>>messages in my log file claimed that a flush failed (with code -12, if I
> >>>remember correctly). I do not have access to the logs at this time but
> >>> I will try to get them soon.
> >>>
> >>>I am running a 2.6.13-ck8 kernel patched with the 2.6.13 reiser4
> >>> patchset. Partitions were made using the 1.0.5 reiser4progs/libaal. In
> >>> the past, I have also replicated this problem using a 2.6.12-gentoo
> >>> kernel with the 2.6.12 patchset and 1.0.5 reiser4progs/libaal.
> >>>
> >>>Is there something that I can do to fix this problem? Is it a known
> >>> bug?
> >>>
> >>>Also, I am not on the mailing list and I would appreciate it if I could
> >>> be cc'd with any information about this problem.
> >>>
> >>>Thanks,
> >>>Steve
> >>
> >>Hey Steve,
> >>
> >>The problem appears to be in the fs/reiser4/wander.c file. In short,
> >>nr_blocks passed to bio_alloc ends up being too big; in my testing 1024,
> >>whereas the maximum the called functions will properly deal with is 256.
> >>
> >>The attached patch adds a hacky fix for this problem, but perhaps someone
> >>who knows the code better should go through and take a look why
> >> max_blocks and nr in write_jnodes_to_disk_extent in wander.c do not
> >> always have a shared minimum of no greater than 256.
> >>
> >>Hope this helps,
> >>Isaac
> >>
> >>
> >>------------------------------------------------------------------------
> >>
> >>--- /mnt/gentoo/usr/src/linux/fs/reiser4/wander.c 2005-10-28
> >> 19:27:02.301541280 -0400 +++ linux/fs/reiser4/wander.c 2005-10-28
> >> 19:29:37.000000000 -0400 @@ -1241,7 +1241,7 @@
> >>
> >> 		while (nr > 0) {
> >> 			struct bio *bio;
> >>-			int nr_blocks = min(nr, max_blocks);
> >>+			int nr_blocks = min(min(nr, max_blocks), 256);
> >> 			int i;
> >> 			int nr_used;

--
Alex.
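
For reference, a minimal userspace sketch of the arithmetic discussed in this thread. It is not kernel code: the helper names blocks_from_sectors() and clamp_blocks() are made up for illustration, and the inputs mentioned below are assumed values, not numbers reported for Steve's hardware. It only shows how shifting max_sectors by (s_blocksize_bits - 9) can produce a block count above the 256 limit that Isaac cites for BIO_MAX_PAGES.

```c
/* Limit cited in the thread (BIO_MAX_PAGES in include/linux/bio.h). */
#define BIO_MAX_PAGES 256

/*
 * Mirrors the old max_blocks expression from wander.c: max_sectors is
 * counted in 512-byte (2^9) sectors, so shifting right by
 * (blocksize_bits - 9) converts sectors into filesystem blocks.
 * Hypothetical helper, named here for illustration only.
 */
static unsigned int blocks_from_sectors(unsigned int max_sectors,
					unsigned int blocksize_bits)
{
	return max_sectors >> (blocksize_bits - 9);
}

/*
 * Clamp applied where max_blocks is set, as Isaac suggests, rather
 * than on nr_blocks inside the while loop. Hypothetical helper.
 */
static unsigned int clamp_blocks(unsigned int max_blocks)
{
	return max_blocks < BIO_MAX_PAGES ? max_blocks : BIO_MAX_PAGES;
}
```

With assumed inputs of 8192 sectors and 4096-byte blocks (blocksize_bits = 12), blocks_from_sectors() gives 1024, the same figure Isaac saw in his testing, and clamp_blocks() brings it down to 256. The patch above avoids the hardcoded constant by asking bio_get_nr_vecs() instead.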