All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Osterlund <petero2@telia.com>
To: Andrew Morton <akpm@osdl.org>
Cc: axboe@suse.de, packet-writing@suse.com, linux-kernel@vger.kernel.org
Subject: Re: ext2 on a CD-RW
Date: 02 Jan 2004 02:30:12 +0100	[thread overview]
Message-ID: <m2llorkuhn.fsf@telia.com> (raw)
In-Reply-To: <20040101162427.4c6c020b.akpm@osdl.org>

Andrew Morton <akpm@osdl.org> writes:

> Peter Osterlund <petero2@telia.com> wrote:
> >
> > If I create an ext2 filesystem
> >  with 2kb blocksize, I hit a bug when I try to write some large files to
> >  the filesystem. The problem is that the code in mpage_writepage() fails if
> >  a page is mapped to disk across a packet boundary. In that case, the
> >  bio_add_page() call at line 543 in mpage.c can fail even if the bio was
> >  previously empty. The code then passes an empty bio to submit_bio(), which
> >  triggers a bug at line 2303 in ll_rw_blk.c. This patch seems to fix the
> >  problem.
> > 
> >  --- linux/fs/mpage.c.old	2004-01-02 00:26:19.000000000 +0100
> >  +++ linux/fs/mpage.c	2004-01-02 00:26:50.000000000 +0100
> >  @@ -541,6 +541,11 @@
> >   
> >   	length = first_unmapped << blkbits;
> >   	if (bio_add_page(bio, page, length, 0) < length) {
> >  +		if (!bio->bi_size) {
> >  +			bio_put(bio);
> >  +			bio = NULL;
> >  +			goto confused;
> >  +		}
> >   		bio = mpage_bio_submit(WRITE, bio);
> >   		goto alloc_new;
> >   	}
> 
> Confused.  We initially have an empty BIO, and we run bio_add_page()
> against it, adding one page.
> 
> How can that bio_add_page() fail to add the page?
> 
> Cold you describe the failure a little more please?

In bio_add_page(), there is this check:

	/*
	 * if queue has other restrictions (eg varying max sector size
	 * depending on offset), it can specify a merge_bvec_fn in the
	 * queue to get further control
	 */
	if (q->merge_bvec_fn) {
		/*
		 * merge_bvec_fn() returns number of bytes it can accept
		 * at this offset
		 */
		if (q->merge_bvec_fn(q, bio, bvec) < len) {
			bvec->bv_page = NULL;
			bvec->bv_len = 0;
			bvec->bv_offset = 0;
			return 0;
		}
	}

The packet writing code has the restriction that a bio must not span a
packet boundary. (A packet is 32*2048 bytes.) If the page when mapped
to disk starts 2kb before a packet boundary, merge_bvec_fn therefore
returns 2048, which is less than len, which is 4096 if the whole page
is mapped, so the bio_add_page() call fails.

I don't think this case ever happens with the standard kernel, because
it doesn't (afaik) contain any merge_bvec_fn implementations that are
as restrictive as the one in the cdrw packet writing code. I still
think the code in mpage_writepage() is wrong, even if it happens to
work in the standard kernel.

> >  I noted that performance is quite bad with 2kb blocksize. It is a lot
> >  faster with 4kb blocksize. (2kb blocksize with the udf filesystem is not
> >  slow, only ext2 seems to have this problem.) Maybe the "confused" case
> >  (which calls a_ops->writepage()) in mpage_writepage() isn't really meant
> >  to be fast. Is there a better way to fix this problem?
> 
> How much slower is it?

I didn't make any exact measurements, but it looked like the pageout
code didn't throw enough parallel bios at the packet writing request
queue, so a lot of incomplete packets had to be written to disk. This
is very costly, because the code then has to do read/modify/write
cycles instead of just writing the whole packet directly.

A rough guess is that 2kb block size was 2-3 times slower than 4kb
block size, but I'll make more tests later to see what's really going
on here.

-- 
Peter Osterlund - petero2@telia.com
http://w1.894.telia.com/~u89404340

  reply	other threads:[~2004-01-02  1:30 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-01-01 23:47 ext2 on a CD-RW Peter Osterlund
2004-01-02  0:24 ` Andrew Morton
2004-01-02  1:30   ` Peter Osterlund [this message]
2004-01-02  9:06     ` Arjan van de Ven
2004-01-02 10:09       ` Jens Axboe
2004-01-02 10:51       ` Peter Osterlund
2004-01-02 10:59         ` Jens Axboe
2004-01-02 12:09           ` Peter Osterlund
2004-01-02 12:19             ` Jens Axboe
2004-01-02 13:38               ` Peter Osterlund
2004-01-02 13:59                 ` Jens Axboe
2004-01-02 16:12           ` Peter Osterlund
2004-03-29 15:38           ` Peter Osterlund
2004-01-02 10:08     ` Jens Axboe
2004-01-03 19:14 ` Pavel Machek
2004-01-03 20:32   ` Peter Osterlund

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=m2llorkuhn.fsf@telia.com \
    --to=petero2@telia.com \
    --cc=akpm@osdl.org \
    --cc=axboe@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=packet-writing@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.